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ABSTRACT 


The  problem  of  obtaining  parametric  models  for  linear 
and  nonlinear  systems  based  on  observations  of  the  input 
and  output  of  the  system  is  one  of  wide  ranging  interest. 
For  linear  systems,  moving  average  (MA)  and  autoregressive 
(AR)  models  have  received  considerable  attention  and,  based 
on  the  Levinson  algorithm,  a  number  of  very  powerful  methods 
involving  lattice  filter  structures  have  been  developed  to 
obtain  the  model  solutions.   For  nonlinear  systems  the 
Volterra  series  model  which  is  a  nonlinear  extension  of  the 
moving  average  model  is  frequently  used. 

The  purpose   of  this  research  is  to  extend  these  tech- 
niques to  more  general  linear  and  nonlinear  models.   Using 
the  equation  error  formulation,  lattice  solution  methods 
in  batch  processing  and  adaptive  form  are  developed  for  both 
single  and  multichannel  autoregressive  moving  average  (ARMA) 
models  for  linear  systems  and  Volterra  series  models  for 
nonlinear  systems.   A  nonlinear  extension  of  the  ARMA  model 
is  also  considered  and  is  shown  in  some  cases  to  remedy 
problems  encountered  in  Volterra  modeling  of  nonlinear  sys- 
tems.  Lattice  methods  are  also  developed  for  the  nonlinear 
ARMA  model  and  it  is  shown  that  the  methods  obtained  for 
linear  ARMA  modeling  follow  as  a  special  case  of  the  non- 
linear results , 

Experimental  verification  of  the  methods  developed  for 
single  channel  linear  ARMA  modeling  is  presented  and  used 


to  explore  the  characteristics  of  the  lattice  solution 
techniques.   The  results  clearly  indicate  that  the  lattice 
methods  are  extremely  powerful,  capable  of  producing  highly 
accurate  system  models  using  short  runs  of  data. 
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I.   INTRODUCTION 

Traditionally,  man  has  attempted  to  create  models  of 
portions  of  his  environment  for  two  principal  reasons: 

1.  To  gain  insight  and  understanding  as  to  their 
functioning ; 

2,  As  a  prelude  to  taking  some  action  such  as  attempting 
to  exercise  control  over  them. 

The  field  of  physics  for  instance,  is  replete  with  examples 
where  men  have  created  models  to  study  and  explain  phenomenon 
from  planetary  motion  to  the  motion  and  even  origin  of  sub- 
atomic particles.   In  designing  machines,  engineers  routinely 
rely  on  models  of  the  components  they  use  to  describe  how  they 
will  function,  and  to  obtain  the  desired  results  in  the  final 
product.   Economics  is  another  field  where  the  use  of  models 
abounds  for  such  purposes  as  identifying,  forecasting  or 
trying  to  direct  trends. 

The  scope  of  the  modeling  problem  is  quite  broad  be- 
gining  with  a  decision  on  the  type  of  model  to  be  used,  what 
physical  quantities  to  measure,  how  to  estimate  the  para- 
meters of  the  model  from  the  measurement,  and  finally  a  veri- 
fication of  the  model.   In  the  chapters  that  follow,  one 
facet  of  this  problem,  that  of  estimating  model  parameters, 
will  be  explored  in  detail  for  a  number  of  linear  and  non- 
linear models . 


A.   THE  PROBLEM  STATEMENT 

The  primary  concern  of  this  work  is  the  determination  of 
discrete  time  models  for  both  linear  and  nonlinear,  time 
invariant  systems  from  sampled  observations  of  the  system 
inputs  and  outputs,   The  general  approach  underlying  all  of 
the  models  examined  here  assumes  that  the  system  to  be 
modeled  is  described  by  the  equation 


y(k)  =  F  0&i(k)]  +  F   [y(k-l)]  +  F30[u(k),y(k-D]      (1.1) 


where  F, n ,  F?n  and  F»n  are  functions  of  past  and  present 
values  of  their  arguments,  and  u(k)  and  y(k)  are  the  system 
input  and  output  respectively.   This  is  depicted  in  Figure 
(1.1).   A  possible  method  for  modeling  this  type  of  system 
is  to  create  a  model  of  exactaly  the   same  configuration  with 
functions  F,  n ,  F?n  and  Fqf],   assume  a  form  for  these  functions, 
operate  the  system  and  model  in  parallel  with  the  same  input 
and  adjust  the  parameters  of  the  model  functions  to  minimize 
the  mean  square  error  (MSE)  between  the  model  output  y(k) 
and  the  system  output.   The  symbol  "*"  is  used  here  to  indi- 
cate that  the  signal  is  an  estimate  of  y(k).   This  is  depicted 
in  Figure  1.2  and  is  often  referred  to  as  direct  form  modeling 
since  the  assumed  topology  of  the  system  is  directly  copied 


Script  notation  will  be  used  to  refer  to  quantities 
associated  with  the  system  while  nonscript  notation  will  be 
used  for  their  corresponding  approximants  in  the  model. 
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u(k) 


*  y(k) 


Figure  1.1.   The  assumed  form  for  systems  to  be 
modeled.   T  represents  a  unit  delay 


u(k) 


t     1 1 


MODEL 


+ 


y(k) 


e(k) 


Figure  1.2.   A  direct  approach  to  system  modeling. 


by  the  model.   Here  the  model  output  is  given  by 


y(k)=F10Cu(k)]+F20[y(k-l)]+F30[u(k) ,y(k-l)]        (1.2) 
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and  the  error  signal  is  referred  to  as  the  "output  error". 
As  an  example,  if  the  system  is  linear  an  appropriate  model 
is 

N 

)  (1.3a) 


F1QCu(k)]  =  2-rf  a(i)u(k-i 


i=0 

N 

F20Cy(k-l)]  =  /  a    Mi)  y(k-i)  (1.3b) 

i=l 


F3Q[u(k),y(k-l)]  =  0  (1.3c) 

(for  linear  models,  F3QC*]  will  be  zero).   In  general 
however,  the  direct  form  approach  will  have  serious  diffi- 
culties if  either  F~0[#]  or  F~n[«]  are  nonzero  since  the 

2  U         3  0 

past  values  of  y(k)  used  in  these  functions  are  themselves 
dependent  on  the  model  parameters,   A  mimimum  mean  square 
output  error  approach  results  in  a  system  of  highly  non- 
linear simultaneous  equations  which  must  be  solved  to 
obtain  the  model  parameters. 

To  avoid  these  difficulties,  the  equation  error  approach 
[Refs,  3  4-  and  23]  to  system  modeling  (which  uses  different 
model  forms  in  the  analysis  and  synthesis  phases)  will  be 
applied  to  the  problem.   The  analysis  model  is  depicted  in 
Figure  1.3  and  differs  from  the  direct  form  model  in  that 
F2Q  and  F__  are  functions  of  past  and  present  values  of  the 
delayed  system  output  rather  than  the  analysis  model  output. 
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u(k) 


ANALYSIS 
MODEL 


y(k) 


+ 


(£)       *  e(k) 


y(k) 


Figure  1.3,   The  equation  error  approach  for 
system  modeling. 


For  each  of  the  models  studied,  a  general  form  for  the  three 
functions  is  assumed  and  the  parameters  of  the  model  (coef- 
ficients of  the  functions)  are  set  to  obtain  a  MMSE  solution 
In  each  case,  the  MSE  cost  function  is  a  quadratic  function 
of  the  model  parameters  (due  to  both  the  equation  error 
formulation  and  to  the  form  chosen  for  the  functions)  with 
a  unique  minimum  and  therefore  the  solution  is  given  by  a 
system  of  linear  equations.   The  synthesis  model  is  of  the 
same  form  assumed  for  the  system  in  Figure  1.1  and  uses  the 
functions  F . _,  F?n  and  Fqn  determined  during  the  analysis 
phase.  , 
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As  an  alternative  to  the  topology  shown  in  Figure  1.3 
it  will  occasionally  be  convenient  to  consider  the  error 
signal  e(k)  as  the  output  of  the  analysis  model  rather  than 
the  prediction  y(k) ,   This  can  be  accomplished  in  one  of  two 
ways  by  defining 

F2Cy(k)]  =  y(k)  -  F2Q[y(k-l)]  (1.4a) 

or 


F3Cu(k) ,y(k)]  =  y(k)-F3Q[u(k) ,y(k-l)]  (1.4b) 


and  reformulating  the  analysis  model  as  shown  in  Figures 
1.4a  or  1.4b.   These  model  forms  are  often  referred  to  as 
prediction  error  models  since  their  outputs  are  the  errors 
in  predicting  y(k)  rather  than  the  predictions  themselves. 
There  are,  however,  no  substantive  differences  between  the 
modeling  approaches  depicted  in  Figure  1.3,  1.4a  and  1.4b. 

The  equation  error  formulation  can  be  generalized  to 
multiple  input  multiple  output  systems  as  well  (henceforth 
referred  to  as  multichannel  systems)  by  considering  u(k)  and 
y(k)  as  vectors  of  Q.  input  signals  and  Qn  output  signals 
and  F, n ,  F  n,  F   ,  F~  and  F  as  vector  functions.   The 
prediction  error  signal  e(k)  becomes  a  Q  -vector  of  signals 

and  the  model  parameters  can  be  set  to  minimize  the  trace 

T 
of  the  prediction  error  covariance  matrix  P  =  s{e(k)e(k)  }, 

Such  generalizations  have  been  developed  to  a  degree  in  the 

multichannel  filtering  literature  and  will  be  investigated 

further  here. 
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^e(k) 
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Figure  1.M-,   Prediction  error  model  forms 
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It  is  important  to  keep  in  mind  however,  that  while  the 
equation  error  formulation  can  be  used  to  find  a  model  solu- 
tion, it  is  an  indirect  method  as  opposed  to  the  direct  form 
method  which  minimizes  the  mean  square  value  of  output  error. 
The  direct  form  model  has  been  modified  to  obtain  the  equa- 
tion error  analysis  model  so  that  the  parameters  can  be  ob- 
tained via  the  solution  of  systems  of  linear  equations.   The 
price  paid  for  this  simplification  in  the  model  analysis 
problem  is  that  additive  noise  on  the  measured  system  out- 
put will  introduce  a  bias  in  the  model  coefficient  estimates. 

B.   OVERVIEW 

Chapter  II  along  with  appendices   A  through  F  provide  a 
unified  review  of  the  existing  background  theory  on  minimum 
mean  square  equation  error  modeling  of  linear  systems.   The 
moving  average  (MA)  and  autoregressive  (AR)  models  are  pre- 
sented and  their  relative  merits  compared.   In  Section  II. C. 
the  Levinson  algorithm  [Refs.  9,  10  and  27]  for  the  AR  and 
MA  models  is  developed,  greatly  simplifying  the  solution 
process  for  these  models.   Section  II. D,  then  shows  that  the 
Levinson  algorithm  defines  the  AR  and  MA  models  in  terms  of 
lattice  filter  structures. 

These  lattice  structures  have  received  widespread  atten- 
tion and  have  led  to  a  host  of  new  developments  in  modeling, 
spectral  estimation,  filter  structures  and  adaptive  filtering. 
Examination  of  the  properties  of  these  forms  have  suggested 
a  number  of  new  methods  for  calculating  model  coefficients 


that  offer  increased  accuracy,  and  in  some  cases  guarantee 
model  stability  even  in  the  presence  of  correlation  estimates 
obtained  by  averaging  over  short  time  intervals  [Refs.  5, 
20,  29  and  36].  Applied  by  Burg  [Ref.  5]  to  spectral  esti- 
mation, these  methods  allow  the  determination  of  power 
spectra  via  AR  modeling  from  very  short  runs  of  data  without 
any  need  for  the  use  of  a  window  function.   In  finite  pre- 
cision arithemetic  implementations ,  the  lattice  structures 
have  been  shown  by  Markel  and  Gray  [Ref.  33]  to  be  less  sensi- 
tive to  roundoff  noise  and  coefficient  quantization  than 
direct  structures  and  have  led  to  the  design  of  other 
structures  that  offer  improved  performance  over  conventional 
parallel  realizations.   Griffiths  has  shown  that  these 
lattices  can  be  implemented  adaptively  [Refs.  16,  17  and  18] 
and  that  they  offer  the  potential  for  more  rapid  convergence 
than  conventional  LMS  adaptive  filters.   Recently  Morf  [Refs. 
36,  37  and  38]  has  also  used  these  lattice  structures  to 
implement  a  recursively  updated  deterministic  least  squares 
adaptive  scheme.   It  is  readily  apparent  therefore,  that  the 
original  work  of  Levinson  and  the  lattice  structures  that 
have  evolved  from  it  have  had  an  important  impact  on  the 
field  of  digital  signal  processing, 

In  Section  II. E.,  the  multichannel  generalization  of  many 
of  the  single  channel  AR  and  MA  modeling  results  is  presen- 
ted.  After  a  discussion  of  the  basic  multichannel  AR  and  MA 
models  [Refs.  26  and  45],  the  multichannel  version  of  the 
Levinson  algorithm  originally  developed  by  Whittle  [Ref.  56], 
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and  Wiggins  and  Robinson  [Ref,  61]  is  presented.   A  new 
form  for  the  models  is  introduced  and  used  here  however, 
to  facilitate  the  application  of  these  results  later  in 
various  other  modeling  problems.   Multichannel  lattice 
structures  are  then  derived  and  from  them   alternative 
solution  methods  for  the  modeling  problems  are  developed. 

Finally,  in  Section  II, F.  the  LMS  adaptive  algorithm 
[Ref.  58]  is  reviewed  and  the  adaptive  implementations  of 
the  lattice  structures  due  to  Griffiths  are  presented. 

In  Chapter  III,  the  more  general  autoregressive  moving 
average  model  is  presented  using  the  equation  error  formu- 
lation attributed  to  Kalman   [Ref.  23].  After  a  brief 
discussion  of  the  model,  new  model  transition  formulas 
are  developed  showing  how  the  ARMA  model  is  related  to  the 
simpler  and  less  general  AR  and  MA  models.   System  input 
signal  requirements  for  the  ARMA  modeling  process  are  ex- 
plored and  it  is  shown  that  the  power  spectrum  of  the  input 
signal  can  be  considered  as  a  frequency  dependent  weighting 
function  in  the  model  optimization.   Then  the  main  result 
of  the  chapter  is  presented.   With  suitable  assumptions, 
a  recursive  in  order  solution  method  for  ARMA  modeling 
(the  (n+l)=st  order  solution  is  obtained  from  the  n-th 
order  solution)  is  obtained  based  on  the  Levinson  algorithm  for 
multichannel  AR  models .   From  this  ,  lattice  solution  methods 
for  the  ARMA  model  are  developed  in  both  batch  processing 
and  adaptive  form,   (Batch  processing  here  refers  to  assuming 
ergodicity  and  estimating  correlations  with  time  averaging) . 
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A  similar  result  has  recently  been  presented  by  Morf  [Refs. 
37  and  38]  with  the  assumption  of  a  white  noise  input  sig- 
nal to  the  system.   The  results  presented  here  follow  from  a 
different  approach  without  the  assumption  of  a  white  noise 
input.   Experimental  results  are  also  presented  verifying 
the  methods  and  theory,  and  showing  their  advantages  (and 
disadvantages)  over  conventional  ARMA  modeling  methods. 
The  programs  used  in  these  simulations  are  listed  in  Appen- 
dix J.   In  Section  III.F. ,  and  Appendix  G,  it  is  shown  that 
these  single  channel  methods  readily  extend  to  the  multi- 
channel ARMA  model,  and  as  one  would  expect,  can  be  obtained 
as  a  special  case. 

In  Chapter  IV  two  types  of  nonlinear  models ,  the  Volterra 
series  model  and  the  new  nonlinear  ARMA  model  recently  pro- 
posed by  Parker  [Ref.  64],  are  considered.   After  a  brief 
discussion  of  the  Volterra  model,  it  is  shown  that  the 
solution  can  be  obtained  using  multichannel  MA  lattice  methods 
if  the  regular  form  of  the  Volterra  kernels  is  used  in  place 
of  the  conventional  symmetric  form.   Then  the  nonlinear  ARMA 
model  is  presented  in  Section  IV. B.  and  it  is  shown  that  for 
many  systems,  this  model  can  remedy  the  problem  of  the  large 
number  of  terms  (ideally  infinite)  required  by  the  Volterra 
model  to  represent  the  system  in  much  the  same  way  that  the 
ARMA  model  solved  the  problem  arising  in  the  MA  model.   In 
Section  IV. B. 2  it  is  also  shown  that  by  using  the  regular 
form,  the  solution  for  the  nonlinear  ARMA  model  can  be 
obtained  using  multichannel  AR  lattice  methods  and  that  the 
linear  ARMA  model  solutions  developed  in  Chapter  III  follow 
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as  a  special  case.   Appendix  I  then  presents  two  examples  of 
nonlinear  ARMA  modeling.   First  a  somewhat  academic  example 
of  a  cascade  of  linear  and  nonlinear  subsystems  is  given 
then  a  nonlinear  ARMA  model  is  proposed  for  the  tracking 
behavior  of  a  phase  locked  loop. 

Finally,  in  Chapter  V,  two  applications  for  the  linear 
and  nonlinear  ARMA  modeling  methods  developed  in  Chapters 
III  and  IV  are  discussed  briefly.   (They  are  reduced  order 
modeling  of  complex  systems  and  modeling  for  fault  detection 
and  diagnosis.)   Then  in  Section  V.B.  conclusions  are  drawn 
on  the  results  of  this  work  and  a  list  of  significant  open 
questions  (both  old  unanswered  questions  and  new  ones 
raised  here)  is  compiled. 
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II.   DISCRETE  TIME  LINEAR  SYSTEM  MODELING;  BACKGROUND  THEORY 

While  few  physical  systems  are  absolutely  linear,  linear 
models  often  suffice  to  accurately  describe  their  behavior 
under  normal  operating  conditions.   A  rich  body  of  theory 
has  therefore  been  developed  for  the  analysis  and  modeling 
of  linear  systems   [Ref.  22]   and  a  thorough  knowledge  of 
this  theory  is  vitally  important  to  anyone  interested  in 
understanding  the  functioning  of  these  systems.   The  con- 
tinuing expansion  in  the  availability  of  powerful,  inexpen- 
sive digital  computing  capabilities  has  also  made  discrete 
time  techniques  take  on  a  special  prominence.   With  this 
as  motivation,  the  portion  of  the  background  theory  in 
discrete  time  linear  modeling  upon  which  much  of  the  re- 
mainder of  this  work  depends ,  is  developed  here  from  the 
unifying  standpoint  of  a  minimum  mean  square  equation  error 
model  solution. 

The  moving  average  and  the  autoregressive  models  are 
developed  first  for  single  input  single  output  systems. 
Their  solution  via  the  Levinson  algorithm  is  presented  and 
from  this  algorithm   alternate  solution  methods  based  on 
lattice  filter  structures  are  derived.   It  is  shown  that 
almost  all  of  these  results  can  be  generalized  to  the 
multiple  input  multiple  output  case  and  the  corresponding 
multichannel  modeling  methods  are  developed.   Finally, 
adaptive  implementation  of  the  modeling  methods  for  both 
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the  conventional  filter  structures  and  the  lattice  filter 
structures  is   presented  as  an  alternative  means  of  obtaining 
model  solutions. 

A.   MOVING  AVERAGE  MODELS 

The  moving  average  (MA)  model  was  among  the  earliest 
discrete  models  developed,  [Refs.  4-,  11  and  19]   It  estimates 
the  current  value  of  the  output  of  a  system  as  a  weighted 
combination  of  the  present  value  and  N  past  values  of  the 
system  input  where  N  is  the  order  of  the  model.   The  problem 
then  is  to  estimate  the  weighting  function  or  impulse  re- 
sponse of  the  MA  model  in  some  fashion.   Since  the  MA  model 
characterizes  a  system  in  terms  of  a  finite  duration  approx- 
imation of  its  impulse  response  and  since  any  linear  time 
invariant,  single  input  single  output  system  is  completely 
specified  by  its  impulse  response,  the  MA  model  is  quite 
general  and  can  be  used  for  a  wide  class  of  systems.   De- 
fining (N+l) -vectors  of  model  weights  and  input  data  as 


T 
a+   =  [a(0)  • • •  aCN)]  X>2  (2.1a) 


A  superscript  "+"  is  used  to  indicate  that  in  spite  of 
the  fact  that  these  vectors  are  used  for  a  N-th  order  model, 
they  are  (N+l) -vectors  with  elements  indexed  from  zero  to  N 
rather  than  from  one  to  N.   Superscript  T  demotes  transpose. 

2  .     .  .    . 

Superscripts  in  parenthesis  will  later  be  added  to  the 

model  coefficient  vectors  to  explicitly  indicate  their  de- 
pendence on  the  order  of  the  problem  being  solved.   They  are 
omitted  for  simplicity  however  whenever  doing  so  does  not 
result  in  ambiguous  notation. 
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u  (K)   =  Cu(k) 


u(k-N)] 


(2.1b) 


the  MA  estimate  of  the  system  output  becomes 


N 

y(k)  =  £  a(n)u(k-n) 
n=0 


T 
u+(k)   a+ 


In  terms  of  the  modeling  approach  of  Figure  1.3,  F    and 
F~n  are  assumed  to  be  zero.   F,  is  a  linear  time  invariant 
function  of  past  and  present  values  of  u(k) .   Assuming  sta- 
tionarity,  an  expression  for  the  mean  square  value  of  the 
error  as  a  quadratic  function  of  the  weights  (a(n)}  is 
given  by 


T 
+         4 

E~  =  a   R  +  +a 

u  u 


-2a 


r     +  R   (0) 
uu 


(2.2) 


u  y 


t« 


where  in  general  R   (n)  =  £  {v (k)w(k+n) } ,  r    =  s  (v(k)w(k+n) } , 
°        vw  -vw     —  ' 

T 

Rvw  =  e{v(k)w(k)  }  and  e{  }  denotes  expectation. 


£  +  + 

u  u 


R   (0)  • • •  R   (-N) 
uu         uu 


R   (N)  • • •  R   (0) 

uu         uu 


r     =  [R   (0) 

-  +      uy 
u  y      J 


R   (N)] 
uy 


The  surface  described  by  equation  2.2  can  be  pictured  as  a 
concave  hyperparaboloid  with  a  unique  minimum  and  the 
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characteristics  of  such  a  surface  are  described  in  Appendix 
F.  For  example  when  N=l,  the  MSE  as  a  function  of  a(0)  and 
a(l)  appears  as  shown  in  Figure  2.1. 


t 


a(0)- 


Figure  2.1.   MSE  as  a  function  of  model  weights 
for  a  first  order  (N=l)  MA  model. 


The  minimum  mean  square  error  solution  for  the  coeffi- 
cients is  given  by 


-  +  +  -  OPT 

u  u 


£  + 

u  y 


(2.3) 
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and  the  corresponding  minimum  value  of  the  cost  function 
E2  is 

+  T 
E2      =   R   (0)  -  aQpT   r  +  (2.4) 

mm      ^  u  y 


Equation  (2.3)  is  a  discrete  time  matrix  form  of  the  Wiener 
Hopf  equation 


■/ 


R   (t)  =   /   R   (t-A)  hQ)  dX  (2.5) 

uy       J        uu 


where  u(t)  and  yd)  are  the  continuous  input  and  output 

signals  and  h(x)  is  the  system  impulse  response.   The  pro- 

+ 

cess  of  finding  a^pT  in  equation  ( 2 .  3)  is  the  discrete  time 

equivalent  of   deconvolving  the  input  autocorrelation  func- 
tion from  the  cross  correlation  of  input  and  output  to 
obtain  the  system  impulse  response  in  equation  (2.5). 
Consequently  the  MA  modeling  process  has  been  called  dis- 
crete Wiener  filtering  or  stochastic   deconvolution . 

This  model  constitutes  a  direct  form  approach  as  defined 
in  section  1.1  but  does  not  encounter  difficulty  in  obtaining 
the  model  weights  since  both  F?n  and  F~n  are  assumed  to  be 
zero.   As  such,  it  possesses  the  advantage  that  the  estimates 
of  the  model  parameters  will  not  be  biased  by  the  presence  of 
additive  noise  on  the  output  of  the  system  as  shown  in 
Figure  2.2,  as  long  as  the  noise  is  uncorrelated  with  the 
input  signal.   This  can  readily  be  seen  by  replacing  y  in 
equation  2.3  by  y  +  v.   Additive  noise  on  the  input  signal 
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Figure  2.2.   Moving  Average  Modeling 

however  will  adversely  affect  the  modeling  process,  and 
introduce  a  bias  in  the  solution  for  the  model  coefficients. 
In  the  transform  domain-,  the  model  can  be  represented  by 
a  polynomial  in  powers  of  z    and  has  therefore  been  referred 
to  as  an  all  zero  model 


N 

A(z)   =   £  a^n)  z 
n=0 


■n 


(2.6) 


In  terms  of  this  transfer  function  relationship,  any  bias 
introduced  in  one  or  more  of  the  model  coefficients  has  the 
effect  of  shifting  the  zero  locations  of  the  model. 

In  summary  a  discussion  of  the  advantages  and  disad- 
vantages of  MA  modeling  is  instructive. 
Advantages : 

1)   The  solution  for  the  model  parameters  involves 
only  linear  equations . 


26 


2)  The  solution  is  unbiased  in  the  presence  of 
additive  noise  on  the  system  output  as  long  as 
the  noise  and  system  input  are  uncorrelated . 

3)  Since  the  model  is  nonrecursive  it  is  always 
stable . 

Disadvantages : 

1)  The  number  of  terms  (N+l)  needed  for  sufficient 
model  accuracy  may  be  quite  large. 

2)  The  solution  of  a  large  system  of  equations  is 
required . 

3)  The  required  correlation  terms  are  usually  not 
known  and  must  be  estimated  by  assuming  ergodi- 
city  and  averaging  in  time.   This  requires  the 
data  to  be  windowed  and  set  to  zero  outside  the 
averaging  interval  in  order  to  maintain  the  even 
symmetry  of  the  autocorrelation  functions. 

4)  The  modeling  process  is  restricted  to  linear 
time  invariant  systems. 

B.   AUTOREGRESSIVE  MODELS 

The  autoregressive  (AR)  model  attempts  to  deal  with  one 
of  the  difficulties  (1)  encountered  in  MA  modeling;  the  need 
for  a  large  number  of  coefficients  to  accurately  describe 
the  model.  [Refs.  2,  4,  11,  19  and  28]   In  AR  modeling, 
which  is  sometimes  referred  to  as  linear  prediction,  a  pre- 
diction error  approach  is  considered  where 

N 
e(k)  =  y(k)  -  £   b(n)  y(k-n)  (2.7a) 

n  =  l 
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This  can  be  written  as 


e(k)  =  y(k)  -  y(k) 


=  y(k)  -  y(k)Tb  (2.7b) 


with 


y(k)   =  [y(k-l)  •••  y(k-N)]T  (2.7c) 


and 


b   =  Cb(l)  • • •  b(N)]T  (2.7d) 


Here  F,  -.    and  F„n  are  assumed  to  be  zero  (this  assumption 
will  be  modified  later  to  allow  a  dependence  on  the  input 
signal  in  the  synthesis  phase)  and  F9n  provides  an  estimate 
of  the  current  value  of  the  system  output  as  a  weighted  sum 
of  N  past  outputs.    The  mean  square  value  of  prediction 
error  as  a  quadratic  function  of  the  weights  (b(n)}  is 
given  by 


E0  =  bTR   b  -  2  bTr    +  R   (0)  (2.8a) 

2 yy-       yy      yy 


and  the  corresponding  MMSE  solution  for  the  weights  is 
given  by 


R  ^rjm  =  r  (2.8b) 

— yy— OPT    — yy 
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with 


E2  .  -    Ryy(0)  ■  ^PTryy  <2'8c) 

mm     yj  JJ 


Using  equation  2.7a,  an  expression  can  be  written 
in  the  transform  domain  for  the  prediction  error  model  which 
accepts  y(k)  as  its  input  and  produces  the  error  sequence 
e(k)  as  its  output. 

rf7l  N 

U|y   =    1    -     E     Mn)z~n    =    B(z)  (2.9) 

n=l 

If  it  is  assumed  that  the  system  input  output  relationship 
can  be  represented  in  transfer  function  form  with 


Y(z)  _  uf.    _     a(0) _  a(0)  ,0  ,  n. 

unry  -  H(z) y   ~  ;    -n    ■  ftz~t  (2-10) 

1-  J]  fa(n)  z 
n=l 


and  that  the  model  parameters  can  be  determined  so  that  B(z) 
=  S(z),  then  the  prediction  error  output  will  be  exactly 
e(k)  =  a(0)  u(k) .   For  this  reason  AR  prediction  error 
modeling  has  often  been  called  inverse  filtering  since  the 
prediction  error  filter  essentially  reverses  the  actions  of 
the  system  (with  the  exception  of  a  gain) .   Since  the 
analysis  model  is  in  this   inverse  form  rather  than  in  the 
direct  form,  the  presence  of  additive  noise  on  the  measure- 
ment of  the  system  output  as  shown  in  Figure  2.3  will  intro- 
duce a  bias  in  the  solution  for  the  model  parameters.   This 
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Figure  2.3.   Autoregressive  prediction  error 
modeling  as  an  inverse  filtering 
process . 


has  the  effect  of  shifting  the  roots  of  B(z)  which  are  es- 
timates of  the  poles  of  the  system  and  is  the  price  paid 
for  the  ability  to  obtain  the  model  solution  from  a  set  of 
linear  equations . 

Thus  far,  only  the  analysis  portion  of  the  AR  modeling 
process  has  been  discussed.   With  the  inverse  filtering 
interpretation  of  the  prediction  error  analysis  model,  a 
reasonable  synthesis  model  is  given  in  transfer  function 
form  as 


H(z)  = 


a(0) 
B(z) 


(2.11) 


with  the  gain  term  set  so  that  the  mean  square  value  of 
a(0)  u(k)  is  the  same  as  that  of  the  prediction  error 
signal.   Thus  it  follows  that 


a(0) 


£{e(kr} 

Hr  ToT" 
uu 


(2.12) 
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Since  the  synthesis  model  is  in  the  form  of  an  all  pole 
filter,  an  appropriate  impulse  response  with  infinite  dura- 
tion might  be  obtained  using  a  low  order  model  (small  N) , 
a  result  that  is  impossible  to  obtain  in  any  finite  order  MA 
model.   This  is  not  to  say  however  that  a  low  or  even  finite 
order  AR  model  will  always  be  an  appropriate  model  for  any 
linear  system.   If  the  transfer  function  representation  for 
a  system  contains  both  zeros  and  poles ,  no  finite  order  AR 
or  MA  model  can  serve  to  exactly  represent  it.   This  fact 
can  be  understood  by  considering  the  form  of  a  geometric 
series 

-Ln      B         .„  -1 


E  (Cz_1)n      for  ICz"1!  <1         (2.13) 


1-Cz"1    n=0 

which  shows  that  a  single  pole  can  be  represented  by  an 
infinite  number  of  zeros  and  visa  versa.   Thus  if  the  sys- 
tem has  a  single  zero,  a  high  order  AR  model  may  be  required 
to  represent  it  with  sufficient  accuracy. 

In  summary,  the  advantages  and  disadvantages  of  AR 
modeling  may  be  listed  as  follows: 

Advantages : 

1)  The  solution  for  the  model  parameters  involves 
only  linear  equations. 

2)  Sometimes  an  appropriate  infinite  impulse  re- 
sponse can  be  obtained  with  a  small  number  of 
parameters  in  the  model. 

3)  Direct  knowledge  or  measurement  of  the  system 
input  is  not  required  for  determing  the  system 
poles .   Only  a  knowledge  of  its  mean  square 
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value  is  necessary  for  determining  the  gain 
factor . 
Disadvantages : 

1)  The  model  is  biased  by  the  presence  of  additive 
noise  on  the  measured  system  output  signal. 

2)  The  number  of  terms  required  for  sufficient 
model  accuracy  may  be  quite  large  if  zeros  are 
present  in  the  system.   If  this  occurs,  the 
inversion  of  a  large  matrix  will  be  required. 

3)  The  required  correlation  terms  are  usually  not 
known  and  must  be  estimated  by  assuming  ergodicity 
and  using  time  averages , 

4)  The  modeling  process  is  restricted  to  linear 
time  invariant  systems. 

This  list  of  advantages  and  disadvantages  is  quite 
similar  to  the  one  compiled  for  MA  models  with  two  notable 
differences-,  the  bias  in  the  model  and  the  absence  of  a 
requirement  for  input  measurements .   This  second  point  is 
significant  in  that  it  has  led  to  the  application  of  AR 
modeling  to  many  problems  where  an  input  signal  is  unmea- 
surable  or  indeed  does  not  exist  including  speech  modeling 
and  spectral  estimation.  [Refs,  2,  5,  12,  15,  21,  32  and  44] 
The  noise  problem  has  restricted  the  process  to  applications 
where  measurements  with  sufficiently  high  signal  to  noise 
ratio  are  available,  making  the  effects  of  the  bias  minimal. 
[Refs.  24  and  43] 
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C.   RECURSIVE  IN  ORDER  SOLUTIONS  FOR  AR  AND  MA  MODELS 

The  preceeding  disucssions  of  the  AR  and  MA  modeling 
problems  tacitly  as summed  an  a  priori  knowledge  of  the  correct 
model  order.   If  this  knowledge  is  not  available  a  reasonable 
approach  for  determining  the  correct  model  order  must  be 
developed   [Ref .  53].  A  commonly  employed  strategy  is  to 
successively  increment  the  model  order  while  observing  the 
MSE  until  further  increases  fail  to  substantially  reduce  the 
MSE .   This  requires  solving  for  a  number  of  different  models 
and  can  be  an  arduous  task  if  equations  (2.8b)  or  (2.3)  are 
employed  directly. 

The  autocorrelation  matrices   appearing  in  the  AR  and 
MA  model  equations  (2.8b)  and  (2.3)  are  highly  structured 
matrices   (both  Toeplitz  and  symmetric)  and  this  fact  can  be 
exploited  to  facilitate  the  solution  of  these  equations. 
The  Levinson  algorithm  [Refs.  9,  10  and  27]  makes  use  of 
this  structure  to  obtain  model  solutions  recursively  in 
order,  that  is,  the  solution  for  the  n-th  order  model  is 
assumed  to  be  known  and  the  solution  for  the  (n+l)-st  order 
model  is  then  obtained  from  it.   In  this  manner  it  is  pos- 
sible to  start  with  a  first  order  AR  or  a  zeroth  order  MA 
solution  given  by  a  single  equation   and  build  up  the 
desired  order  solution.   The  AR  model  will  be  treated  first 
since  it  is  a  special  case  that  simplifies  the  analysis. 

The  simplifications  arise  due  to  the  fact  that  the  r    vector 

-yy 

on  the  right  hand  side  of  equation  (2.8b)  is  made  up  pri- 
marily of  terms  also  appearing  on  the  left  hand  side  in  R 
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Superscripts  in  parenthesis  are  used  to  explicitly  indicate 
the  order  of  the  problem  when  specifically  needed. 
1 .   The  Levinson  Algorithm  For  AR  Modeling 

The  n-th  order  AR  model  solution  of  equation  2.8b 
is  given  by 


R  (n)  b(n)  _    (n) 

-yy   -     -yy 


(2.14) 


The  Levinson  algorithm  assumes  a  relationship  between  the 
n-th  and  (n+l)-st  order  solutions  given  by 


(n+1) 


(n) 


(n) 


k 


(n+1) 


(2.15) 


and  solves  for  the  vector  £    and  the  coefficient  k 
Define  permuted  versions  of  the  vectors  b    and  r  '    as 

-yy 

f    and  £     by  reversing  the  order  of  their  elements. 


(n) 


b(n) 


b(l) 


(n) 


(n) 


(n) 


-yy 


R   (n) 

yy 


R   (1) 

yy 


(2.16) 


Because  of  the  Toeplitz  symmetric  structure  of  the  auto- 
correlation matrix,  equation  (2,14)  can  also  be  written  as 


R  (n) 

-yy 


(n) 


P  (n) 

-yy 


(2.17) 
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and  this  relationship  is  essential  in  the  development  of  the 
Levinson  algorithm.   (To  apply  the  algorithm  therefore  when 
time  averaged  estimates  of  the  needed  correlations  are  used, 
the  data  must  be  windowed  prior  to  averaging  to  maintain  the 
even  symmetry  in  the  autocorrelation  function  estimates  and 
produce  the  required  structure  in  the  autocorrelation  matrix.) 
Making  use  of  equation  (2.15),  in  the  (n+l)-st  order  version 


of  equation  (2.14),  and  using  the  relationship  of  (2.17)  to 

,    p     (n)    .  .  (n+1)      ,. 

solve  for  £    and  k       results  in 

% 

(n)     fi-Cn).  (n+1)  ,  _  .  _  , 

£    =  -f   k  ( 2 . 18a) 

and 

,n+1,    R   (n+1)  -  p(n)Tb(n) 
k(n+1)  =  J& =^— - (2.18b) 

R   (0)  ~  b(n)  r  (n) 

yy        -     _yy 

Therefore,  in  using  equations  2,15  and  2.18  to  obtain  b 
from  b     via  the  Levinson  algorithm  only  one  new  piece  of 
information,  k      ,  need  be  calculated.   The  denominator 
of  equation  (2.18b)  can  also  be  recognized  from  equation 
(2,8c)  as  the  MMSE  for  the  n-th  order  AR  model  E«    ,  and 
thus  there  is  little  concern  over  the  possibility  of  it  being 
zero.   If  the  n-th  order  solution  produces  a  perfect  predic- 
tion (zero  MSE)  there  is  no  point  in  trying  to  find  a  better 
prediction  by  increasing  the  order  to  n+1.   The  evaluation  of 
equation  (2.18b)  can  be  further  simplified  by  observing  that 
the  MSE  also  follows  a  recursion  from  one  order  to  the  next 
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given  by 


(n+l)  =  E2(n)  Cl  _   k(n+l)  } 


(2.19) 


making  it  unnecessary  to  evaluate  the  denominator  at  each 
value  of  n.   (Details  of  this  derivation  are  omitted  here 
but  included  in  Appendix  A  in  the  derivation  of  the  more 
general  multichannel  Levinson  algorithm. )   This  relation  for 
the  propagation  of  mean  square  prediction  error  also  leads 
to  the  restriction  that  k      must  be  bounded  in  magnitude 
by  unity. 

2 .   The  Levinson  Algorithm  For  MA  Modeling 

Next  consider  the  n-th  order  MA  model  given  by 


R  (n)   *+ 
u  u 


(n) 


=  £  + 


(n) 

f 
u  y 


(2.20) 


and  again,  assume  a  relationship  between  the  (n+l)-st  order 
and  n-th  order  solutions  given  by 


(n+l) 


(n) 


Y 


(n+l) 


(n+l) 


(2.21) 


Notice  that  in  this  n-th  order  problem,  R    +  is  actually  a 

u  u 

n+l  by  n+l  matrix  and  could  be  written  as  R       .   Define 
J  — uu 
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(n+1) 


-uu 


R   (1) 

uu 


R   (n+1) 
uu 


(n+1) 


-uu 


R   (n+1) 
uu 


R   (1) 
uu 


(2.22) 


Using  equations  (2,21)  and  (2,22)  in  the  n+1  order  MA  model 
equation  it  follows  that 


(n+1)      .(n+1)  (n+1) 
Y       =  «-  f      g 


(2.23a) 


where  f      is  defined  in  a  manner  similar  to  (2.16)  and  is 
comprised  of  the  coefficients  that  arise  in  an  (n+l)-st 
order  autoregression  on  the  input  signal  u(k) . 
Therefore  to  obtain  a  moving  average  model  relating  the 
system  output  signal  y(k)  to  the  input  signal  u(k),  an 
autoregressive  model  for  the  system  input  must  first  be 
solved.   Furthermore, 


(n+1) 


id  (    xt  ^     (n+1)   + 
R   (n+l)-p         a 

uy — uu — 

R   (0)-p  <n+l)Tf(n+l) 

uu     --uu      — 


(2.23b) 


and  the  denominator  of  equation  (2.23b)  is  the  MMSE  in  the 
(n+l)-st  order  autoregression  on  the  signal  u(k) . 

It  is  significant  to  note  that  in  applying  the 
Levinson  algorithm  to  find  a  given  order  AR  or  MA  model,  all 
lower  order  models  along  with  their  MSE ' s  are  obtained. 
Also,  intermediate  quantities  emerge  (the{k    }  in  the  AR 
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model  and  the  {k    }  and  {g    }  in  the  MA  model)  which  fully 
characterize  the  models  and  could  be  used  as  an  alternative 
to  the  {a(n)}  or  {b(n)}  coefficients.   This  point  will  be 
developed  further  in  subsequent  sections. 

D.   LATTICE  FORM  AR  AND  MA  MODELS 

The  Levinson  algorithm  derived  in  the  previous  section 
can  be  used  to  derive  lattice  structures  to  implement  the 
MA  model  and  the  AR  analysis  and  synthesis  models  as  alter- 
natives to  a  tapped  delay  line  type  of  implementation  using 
the  coefficients  a(n)  or  b(n)  directly.  [Refs.  29,  30,  32 
and  3  3] 

1.   The  AR  Modeling  Lattice  Structures 

From  the  relationship  between  the  (n+l)-st  and  n-th 
order  solutions  to  the  AR  modeling  problem  determined  in 
equations  (2.15)  and  (2.18a)  it  follows  that  the  transfer 
function  of  the  prediction  error  model  can  be  written  re- 
cursively in  order  as 

9 

BCn+1)(z)  =  B(n?z)  -  k(n+1)z-n-1B(n)(Z-1)      (2.24) 


Defining  a  new  transfer  function 


1. 


B"(n)(z)  =  z-nB(n)(z-1)  (2.25a) 


equation  (2.24)  can  be  written  as 


D(n+1),  >.    r,(n),  *    ,  (n+1)  -l~-(n),  s         ,  0  oc-u>> 
B      (z)  =  B    (z)  -  k      z   B    (z)         (2.25b) 
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and  an  expression  can  also  be  written  for  B      (z)  recur- 
sively in  order  by  rewriting  equation  (2.25a)  for  order 
(n+1)  and  substituting  equation  (2,24)  yielding. 


B(n+1)(z)  =  z-¥n)(z)  -  k(n+1)B(n)(z)         (2.25c) 


As  discussed  earlier  in  connection  with  equation  (2.9), 
3    (z)  describes  the  n-th  order  prediction  error  model  and 
when  its  input  is  the  system  output  Y(z),  it  produces  the 
n-th  order  prediction  error  signal. 


E(n)(z)  =  3(n)(z)  Y(z)  (2.26) 


In  the  time  domain  this  signal  can  be  interpreted  as  the 
error  in  predicting  y(k)  forward  in  time  from  a 
weighted  combination  of  the  n  past  values  (y(k-l) *  * •y(k-n) } 
To  understand  the  significance  of  B    (z)  consider  the  out- 
put signal  when  this  model  is  excited  by  Y(z). 


E(n)(z)  =  B(n)(z)Y(z) 


=  z~n[l  -  £   b(n)(i)z+i]Y(z)  (2.27) 

Lai 

In  the  time  domain,  e  n  (k)  can  be  interpreted  as  the  error 
in  predicting  y(k-n)  backward  in  time  from  a  weighted 
combination  of  the  future  signals  {y(k-n+D * ■ ■ y (k) } .   These 
n-th  order  forward  and  backward  prediction  processes  at  time 
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k   are  illustrated  in  Figure  2.4.   Henceforth,  an  overbar 
will  always  be  used  to  denote  quantaties  associated  with 
backward  in  time  predictions. 


y 


(k)  f 


Forward 
Prediction 


Backward 
Prediction 


Figure  2.4-.   Forward  and  Backward  Prediction  Error 
Filtering . 


From  equations  (2.25b)  and  (2.25c)  equations  can  be  written 
recursively  in  order  for  these  forward  and  backward  predic- 
tion error  sequences  as: 


Q(n+l)r,v  _   (n),.  n    ,  (n+l)-(n) ,.  .. 

e      (k)-e    (k)  -  k     e    (k-1) 


(2.28a) 


■(n+1)(k)  =  e"(n)(k-l)  -  k(n+1)e(n)(k) 


(2,28b) 


Noting  that  the  prediction  error  for  a  zero-order  AR  pre- 
dictor of  y(k)  (or  no  predictor  at  all)  is  just  the  signal 
y(k)  itself, 


Co) ,,  %    —  ( o )  / ,  <.     •  ,  >. 
e    (k)  =  e    (k)  =  y(k) 


(2.28c) 


the  prediction  error  filter  can  be  drawn  in  lattice  form  as 
shown  in  Figure  2.5  for  a  second  order  model. 


y(k) 


e(1)(k) 


(2) 


(k) 


i(2)(k) 


Figure  2.5.   Lattice  Form  Of  A  Second  Order  Pre- 
diction Error  Model. 


This  structure  has  many  interesting  properties , 
among  the  most  important  of  which  is  the  successive  de- 
coupling property.   In  going  from  one  order  AR  model  to  the 
next,  all  of  the  previously  determined  transfer  function 
coefficients  (b(n)}  will  generally  change.   The  Levinson 
algorithm  shows  however  that  only  one  new  piece  of  infor- 
mation is  needed  to  obtain  the  optimum  (n+l)-st  order 
solution  from  the  optimum  n-th  order  solution  (see  equation 
(2.24)),   In  terms  of  the  lattice  filter  of  Figure  2 . 4  tb 


U  1 


means  that  given  the  optimum  n-th  order  model  in  lattice 
form,  one  need  only  add  another  stage  to  the  structure, 
setting  the  coefficient  of  that  stage  k  *     to  minimize 
the  mean  square  value  of  e      (k) .   Nothing  in  the  first 
n  stages  need  be  changed.   The  overall  high  order  minimi- 
zation problem  is  in  this  fashion  decomposed  into  a  se- 
quence of  first  order  minimizations,  one  at  each  lattice 
stage . 

Another  important  property  of  the  lattice  which  can 
be  proven  and  will  be  of  use  later  is  the  orthonogalization 
of  the  backward  prediction  error  sequence  [Ref.  32]  which 
states  that 

0  i*j 


e{e"(i)(k)e(j)(k)}  = 


(2.29) 


-p  Ci) 

E2  i  =  ] 


Thus  it  is  seen  that  a  set  of  orthogonal  signals  (the  back- 
ward prediction  errors  at  the  various  stages)  are  generated 
as  a  by-product  of  the  lattice  model. 

As  a  consequence  of  the  successive  decoupling  pro- 
perty of  the  lattice,  a  number  of  alternatives  to  equation 
(2,18b)  for  determining  the  lattice  coefficients  can  be 
found.   The  most  obvious  method  is  to  set  k      to 
explicitly  minimize  the  mean  square  value  of  forward  pre- 
diction error  in  equation  (2,28a)  at  the  (n+l)-st  order 
stage  given  the  best  lattice  of  order  n.   This  is  termed 
the  forward  method  and  is  denoted  by  a  subscript  F  on  the 
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lattice    coefficients.      The    resulting    solution    is    given   by 


u.    (n+1)         £{e(n)(k)e(n)(k-l)}  fn    ... 

F ;-(n)M     ..2,  C2.30J 

s le        (k-1)    } 


Alternately,  the  mean  square  value  of  the  backward  predic- 
tion error  signal  in  equation  (2.28b)  could  be  minimized  to 
determine  the  coefficient  resulting  in  the  backward  method 
solution  given  by 


v  (n+1)    g(e(n)(k)e"(n)(k-l)}  ro  ... 

B       "     r  (n)rv,2\  (2.31) 

£  le    (k)  } 


Since,  however,  the  forward  and  backward  prediction  error 
transfer  functions  are  given  by  B    (z)  and  z"    B    (z~  ), 
it  follows  that 


|B(n)(z)|  =   |B(n)(z)|  (2.32) 


and  since  they  are  both  driven  by  the  same  input,  Y(z),  the 
mean  square  values  of  both  the  forward  and  backward  predic- 
tion error  signals  at  a  given  stage  are  the  same  making 
equations  (2.30)  and  (2.31)  equivalent.   It  is  also  possible 
to  show  that  they  are  equivalent  to  equation  (2.18b). 
Recognizing  that  the  required  expectations  will  eventually 
have  to  be  estimated  by  using  time  averages,  these  two 
methods  for  calculating  k      will  not  in  general  be 
exactly  equivalent  and  it  might  be  preferable  to  use  the 
arithmetic  mean  of  the  mean  square  values  of  forward  and 


backward  prediction  error  as  a  cost  function 


,  r  r  (n+1)  ,,  v  2    —  (n+1)  ,,  v2n  eo  o->  % 

^Lete      (k)+e      (k)  }J  (2.3  3a) 


This  leads  to  a  third  method  derived  by  Burg  [Ref .  5]  in 
his  work  on  maximum  entropy  spectral  analysis  and  given  by 


,  (n+1)    2s{e(n)(k)e~(n)(k-l)  }  f0    ,-.  , 
£  {e    (k)  }  +  s  {e    (k-1)  } 

u~*-    4.v  4-  i  (n+1)  .   ..   .      .         -  .  (n+1)    ,  .  (n+1) 

Notice  that  kD~     is  the  harmonic  mean  of  k„  and  kn 

Db                                              r  D 

A  fourth  method  due  to  Itakaura  and  Saito  [Ref.  20]  can  also 
be  derived  which  results  in 


k(n+l)  s  £(e(n)(k)e(n)(k-l)}-  (2.34) 

V£{e(n)(k)2}£{e(n)(k-1)2}' 


and  kTC,     is  simply  the  geometric  mean  of  the  forward  and 
backward  coefficients. 

Since  equation  (2.34)  is  of  the  form  of  a  normalized 
correlation  kT<-,  will  always  be  bounded  by  unity  in  magnitude 
as  required  by  equation  (2.19).   Furthermore  since 

Harmonic  Mean     <     Geometric  Mean 


it  follows  that  kD~     will  be  similarly  bounded.   These 

DO 

bounds  are  significant  since  Markel  [Ref.  32]  has  shown  that 
|k    | <1  is  a  necessary  and  sufficient  condition  to  ensure 


that  the  roots  of  B    (z)  be  within  the  unit  circle  guaran- 
teeing the  stability  of  the  n-th  order  all  pole  model.   No 
such  guarantees  of  model  stability  exist  when  the  forward 
or  backward  solution  methods  of  equations  (2.30)  and  (2.31) 
are  used  with  the  correlation  estimates  obtained  by  aver- 
aging for  finite  time  intervals. 

To  determine  the  AR  synthesis  model  in  lattice  form 
it  is  only  necessary  to  rewrite  equation  (2.28a)  as 


(n),,  N     (n+1),.  ,     (n+D— (n),,  ,  x  /0  0  c  >. 

e    (k)  =  e      (k)  +  k      e    (k-1)  (2.35) 


Together  with  equations  (2.28b)  and  (2.28c),  this  describes 
the  structure  shown  in  Figure  2.6  for  a  second  order  case 
and  when  it  is  driven  by  the  second  order  prediction  error 
signal,  it  will  reconstruct  y(k)  exactly.   Thus  it  imple- 
ments the  transfer  function  — t-k-t or,  in  general,  when 

1         B^J(z) 

stages  are  used  --/yr\ .  .   Recognizing  that  if  the  pre-  . 

BUJ;(z) 
diction  error  model  is  an  accurate  model  of  the  system 

denominator  polynomial,  e    (k) =a(o)u(k) ,  this  input  signal 
is  used  in  the  synthesis  model.   Because  of  analogies  with 
transmission  lines  and  wave  propagation  models,  the  lattice 
coefficients  have  been  referred  to  as  reflection  coefficients 
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(2),,, 
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e    (k) 


Figure  2.6.   Lattice  Form  Of  The  All  Pole  Synthesis 
Model  For  The  Second  Order  Case. 


2 .   The  MA  Modeling  Lattice  Structure 

A  similar  lattice  form  is  applicable  to  the  MA 
modeling  problem.   From  equations  (2.21)  and  (2.23a)  the 
transfer  function  of  the  MA  model  can  be  written  recursively 
in  order  as 


A(n+1),  v    A(n).  ,     (n+l)Fh+l,  x 
A      (z)  =  A    (z)  +  s      B    (z) 


(2.36) 


where,  as  discussed  in  connection  with  equation  (2.23a), 
B    (z)  is  the  backward  prediction  error  transfer  function 
for  an  autoregressive  model  of  the  input  signal  u(k) . 
Multiplying  both  sides  of  equation  (2,36)  by  U(z)  and  trans- 
forming into  the  time  domain  it  follows  that 


*(n+l),,  .    A(n)M,     (n+l)-n+lrvs 
y      (k)  =  y    (k)  +  g     e    (k) 


(2.37) 
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f  1 

where  e   x(k)  is  the  backward  prediction  error  signal  from 
the  autoregression  on  the  input  signal  u(k) ,  and  can  be 
obtained  by  operating  a  prediction  error  lattice  with  u(k) 
as  its  input.   Then  with  the  additional  term  in  equation 
(2.37)  the  lattice  form  of  the  MA  model  can  be  drawn  as 
shown  in  Figure  2.7. 
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y    (k) 
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Figure  2.7.   Lattice  Form  Of  The  MA  Model  (Second 
Order  Case) . 


It  was  stated  earlier  that  the  AR  prediction  error 
lattice,  as  a  by-product,  forms  a  set  of  orthogonal  or 
uncorrelated  backward  prediction  error  signals  from  its 
input.   Here  in  the  MA  model,  these  orthogonal  signals  are 
linearly  combined  to  form  the  MA  estimate  of  the  system 
output.   If  the  input  signal  u(k)  is  a  white  process,  an 
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examination  of  any  of  the  solution  methods  previously  dis- 
cussed will  show  that  all  of  the  {k    }  lattice  coefficients 
will  be  zero  since  the  delayed  samples  of  a  while  process 
are  already  orthogonal.   Otherwise  the  {k    }  lattice 
coefficients  will  be  set  to  orthogonalize  the  backward  pre- 
diction error  signals.   As  a  consequence  of  this,  the 
weighting  coefficients  g     can  be  set  independently  of 
each  other;  that  is  g     can  be  set  to  minimize 


£{e(n)(k)2}  =  s{[y(k)  -  y(n(!k)]2}  (2.38) 


given  the  best  prediction  of  order  n-1,  y    (k)  .   This  results 
in  an  alternative  expression  to  equation  (2.23b)  for  g 
given  by 

(n)    sfj^OO  g(n)(k)}  _  e(y(k)  g(n)(k)) 
g    =    £{i(»>Ck)2}         £{5(n)(k)2}     (2-39> 

Here  e    (k)  is  the  error  between  the  system  output  and  its 
o  j  r 

n-th  order  MA  estimate. 


E,   MULTICHANNEL  AR  AND  MA  MODELING 

Both  the  AR  and  MA  modeling  problems  previously  dis- 
cussed, as  well  as  their  solution  via  the  Levinson  recursion 
and  lattice  filter  methods,  can  be  generalized  to  the 
multichannel  case  by  replacing  the  various  signals  with 
signal  vectors  and  replacing  the  weighting  coefficients 
with  appropriately  dimensioned  matrices  of  coefficients. 
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A  discussion  of  this  appears  in  Robinson  [Ref.  4-5],   The 
equations  that  describe  the  AR  and  MA  models  and  their  MMSE 
solutions  are  repeated  here  for  convenience. 


N 


AR 


y(k)  =  £  Mi)  y(k-i) 


i=l 


(2.40a) 


R   (0)     R   (-1)  , 

yy      yy 


R   (1)     R   (0) 

yy      yy 


R   (N-l)   R   (N-2) 

yy      yy 
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R   (2-N) 

yy 


R   (0) 
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b(N) 


R   (1) 
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R   (2) 
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R   (N) 
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(2.40b) 


MA 


N 
y(k)  =  £  a(i)  u(k-i) 
i=0 


(2.41a) 


R   (0)     R   (-1)  , 

uu        uu 


R   (1)     R   (0) 

uu        uu 
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uu        uu 


,  .  R   (-N) 
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R   (0) 
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R   (N) 

uy 


(2,41b) 


In  a  multichannel  generalization,  y(k)  becomes  a  vector  of 
Qn  output  signals,  u(k)  becomes  a  vector  of  Q.  input  signals, 
b(i)  becomes  a  square  matrix  of  Qn  x  Qn  coefficients  and  a(i) 


0 


0 


becomes  a  Qn  x  Q.  matrix  of  coefficients  so  that  the  N-th 
x0    i 


order  multichannel  model  equations  can  be  written  as 

AR        y(k)  =  £   b(n)  y(k-^n)  (2.42a) 

n  =  l 


N 
MA        y(k)  =  £  a(n)  u(k-n)  (2.42b) 

n=0 


The  equations  for  the  MMSE  solutions  (2.40b)  and  (2.41b) 

generalize  directly  as  well  by  replacing  each  correlation 

coefficient  R   (n)  by  matrices   of  correlation  coefficients 
vw     J 

given  by 


R   (n)  =  e{v(k)  w(k+n)T}  (2.43) 

— vw        —    — 


where  y_(k)  and  w(k)  are  signal  vectors.   This  causes  the 
overall  correlation  matrices   to  take  on  a  block  Toeplitz 
structure.   The  transfer  function  relationships  of  the  AR 
prediction  error  model  and  the  MA  model  take  the  form  of 
matrix  polynomials 


B(z)  =  Ml)  z"1  +  '••  +  b(N)  z~N  (2.44a) 


A(z)  =  a(0)  +  a(l)  z~X  +  •••  +  a(N)  z~N        (2.44b) 


so  that 


E(z)  =  [I  -  B(z)]  Y(z)  (2.44c) 


Y(z)  =  A(z)  U(z) 


(2.44d) 


where  E(z)  is  the  transform  of  the  multichannel  AR  prediction 
error  vector  e(k)  =  y(k)  -  y(k)  .   Alternately,  equations 
(2.4-M-a)  and  (2.4-4-b)  can  be  written  as  single  matrices   whose 
entries  are  polynomials  in  z  rather  than  scalars.   B(z)  is 
of  necessity  a  Q   x  QQ  square  matrix  polynomial  while  the 
dimensions  of  A(z)  (Q   x  Q.)  depend  upon  the  number  of 
inputs  and  outputs  which  need  not  be  the  same. 

In  the  single  channel  AR  problem,  B(z)  provides  the 
transfer  function  of  the  prediction  error,  or  inverse  fil- 
ter, and  must  be  inverted  to  obtain  the  all  pole  synthesis 
filter.   The  stability  of  the  synthesis  model  therefore 
depends  upon  the  roots  of  this  polynomial.   The  matrix 
polynomial  [I-B(z)]  in  the  multichannel  AR  problem  is,  in 
like  fashion,  an  inverse  filter  representation  and  must  be 
inverted  to  obtain  the  synthesis  model.   This  inversion  of 
a  matrix  with  polynomial  entries  is  defined  in  the  same 
manner  as  the  inversion  of  a  square  matrix  with  scaler 
entries.   To  see  what  this  inverse  matrix  polynomial  looks 
like  consider  as  an  example  a  two  channel  autoregression 
with  a  prediction  error  filter  given  by 


[I-B(z)]  = 


1  -  bn(z) 


~b21(z) 


~b12(z) 


1  -  b22(z) 
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Applying  Cramer's  rule,  the  inverse  matrix  polynomial  is 
written  as 


Cl-B(z)] 


-1 


det[I-B(z)] 


1-b   (z)    b21(z) 


b12(z) 


l-bxl(z) 


and  it  is  apparent  that  the  stability  of  the  multichannel 
synthesis  model  is  dependent  upon  the  locations  of  the  zeros 
of  the  polynomial  det[I~B(z)], 

This  straightforward  generalization  of  the  AR  and  MA 
models  is  what  has  customarily  been  used  in  the  literature 
to  develop  the  multichannel  models  and  similarly  derived 
generalizations  of  the  Levinson  algorithm  to  solve  them 
recursively  in  order  are  available  as  well.  [Refs.  57  and  61] 
The  multichannel  AR  and  MA  modeling  problems  and  their 
solutions  via  the  Levinson  algorithm  can  however  be  recast 
as  shown  in  Appendix  A  to  make  them  compatible  with  the  form 
of  other  linear  and  nonlinear  modeling  problems .   To  avoid 
confusion  later  in  the  application  of  the  results,  the 
derivations  in  Appendix  A  have  been  carried  out  in  a  generic 
form  with  x  and  d  used  to  represent  some  of  the  signals  and 
coefficients  respectively.   The  symbols  u,  y,  a  and  b  have 
been  reserved  to  denote  system  input,  output  and  weighting 
coefficients . 

Equations  (A. 7)  and  (A. 26)  provide  the  MMSE  solutions  to 
the  multichannel  AR  and  MA  modeling  problems  in  forms 
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different  than  (although  entirely  equivalent  to)  those 
resulting  from  the  straightforward  generalizations  of 
equations  (2,40b)  and  (2,41b),   The  multichannel  generali- 
zation of  the  Levinson  algorithm  derived  in  Appendix  A 
can,  with  one  exception,  be  seen  as  a  matrix  algebra 
generalization  of  the  single  channel  algorithm  and  as,  one 
would  suspect,  the  single  channel  algorithm  results  as  a 
special  case  of  Appendix  A.   The  one  exception  is  that  in 
the  multichannel  case,  the  n-th  order  forward  and  backward 
prediction  error  models  are   not   simply   related  to  one 
another.   The  single  channel  AR  backward  predictor  is  given 
by  z   B    (z   )  but  in  the  multichannel  case  the  backward 
prediction  is  not  z"  [I-  £)    (z~  )]. 

•Because  of  this,  two  reflection  coefficient  matrices 
K  and  K  are  required  at  each  stage  in  the  recursion  to 
relate  the  n-th  and  (n+l)-st  order  solutions  rather  than 
just  one  as  in  the  single  channel  problem.   Also,  in  the 
single  channel  case,  the  fact  that  |B    (z)|  =  |B    (z)| 
and  therefore  e {e(n) (k) 2 }=c {e(n) (k) 2 }  made  possible  the  de- 
rivation of  the  Burg  algorithm  and  the  Itakaura-Saito  al- 
gorithm, which  ensured  the  magnitude  of  the  reflection 
coefficient  was  bounded  by  unity  and  that  equation  (2.19) 
would  result  in  nonnegative  values  of  MSE ,   In  the  multi- 
channel algorithm  however,  the  forward  and  backward  predic- 
tion error  covariance  matrices   and  their  traces  are  not  the 
same  (except  for  P     and  P     and  as  a  result,  straightfor- 
ward generalizations  of  the  Burg  and  Itakaura-Saito  algorit 
are  not  possible. 
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Consequently  with  correlation  estimates  obtained  by  averaging 
over  finite  time  intervals ,  there  are  no  guarantees  that 
equations  (A. 18a)  and  (A. 19b)  maintain  the  positive  definite- 
ness  of  the  prediction  error  covariance  matrices. 

Multichannel  generalizations  of  Burg's  algorithm  due  to 
Nuttal  [Refs.  40  and  41],  Morf  [Ref.  39],  and  Strand 
[Ref.  50]  which  guarantee  the  positivity  of  the  covariance 
matrices  -  are  available  but  are  not  explored  here  because 
of  their  complexity  and  because  they  would  take  the  dis- 
cussion too  far  afield. 

The  relations  of  equations  (A. 20)  and  (A. 30)  which  are 
repeated  here  for  convenience  permit  the  construction  of 
the  multichannel  AR  analysis  and  synthesis  lattice 
structures  and  the  MA  lattice  structure.   For  the  multi- 
channel AR  model, 


T 
(n+1),,  >.     (n),,  >.    ,,(n+l)  —  (n),,  -,  >.         ,n    „  c  >, 
e      (k)  =  e    (k)  -  K      e    (k-1)         (2.45a) 


T 
-(n+lK.  .    —  (n),,  ,.    ^(n+l)   (n),,  s         ,  0  llC.  . 
e      (k)  =  e    (k-1)  -  K       e    (k)         (2.45b) 


e(0)(k)    =  ~e(0)(k)  =  x(k)  (2.45c) 


and  the  corresponding  prediction  error  lattice  is  shown  in 
Figure  2.8. 
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x(k) 


(OK.  , 

e    (k) 


e    (k) 


e(0)(k) 


e(1)(k) 


e    (k) 


e    (k) 


Figure  2.8 


Multichannel  AR  prediction  error  lattice 
structure  for  a  second  order  model.  All 
signal  paths  are  vector  paths  and  summa- 
tions are  vector  summations.  The  multi- 
plications indicate  premultiplication  of 
the  signal  vector  by  the  specified  coef- 
ficient matrix. 


To  obatin  the  multichannel  AR  synthesis  lattice,  equation 
(2.45a)  can  simply  be  rewritten  as 


(n),,  ^     (n+1),.  s    ^(n+1)  — (n)M  ,% 
e    (k)  =  e      (k)  +  K      e    (k-1) 


(2,46) 


resulting  in  the  structure  shown  in  Figure  2.9 


(2>^ 
e    (k) 


i(2)(k) 


e    (k) 


e    (k) 


e(0)(k)=x(k) 


-CO),.  , 

e    (k) 


.  Figure  2  .  9 


Multichannel  AR  synthesis  lattice 
structure  for  a  second  order  model 
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For  the  multichannel  MA  model,  equations  (2.M-5)  and 


"(n+l),,.    *<n)n  *.p(n+l)   ~(n+l)M> 

y      (k)  =  y    (k)+G       e      (k) 


(2.47) 


describe  the  lattice  structure  shown  in  Figure  2.10. 


n(k)  d 


y    (k) 


Figure  2.10.   Multichannel  MA  lattice  structure 
for  a  second  order  model. 


As  in  the  single  channel  case,  the  multichannel  predic- 
tion error  lattice  exhibits  the  successive  decoupling  pro- 
perty and  orthogonalizes  the  backward  prediction  errors  at 
the  various  stages  so  that 


e(i(i)Ck)  e(^k)T}  = 


0 


p(i) 


i*j 


1=3 


(2.48) 
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As  a  consequence  of  the  successive  decoupling,  the  forward 
and  backward  reflection  coefficient  matrices   at  the  (n+l)-st 
stage  can  be  set  to  minimize  the  trace  of  the  forward  and 
backward  prediction  error  covariance  matrices   respectively, 
given  the  best  lattice  of  order  n.   This  provides  alterna- 
tives to  equations  (A. 17)  for  calculating  K  and  K  and  also 
generalizes  the  forward  and  backward  single  channel  solutions 
discussed  previously,  resulting  in 


K(n+1)  __    p(n)-1  A(n)T  (2.49a) 


K(n+1)  =  P(nrl  A(n)  (2.49b) 


where 


A(n)  s  e{eCn)(k)  e(n)(k^l)T}  (2.49c) 


It  is  also  possible  to  show  that  these  relationships  are 
entirely  equivalent  to  equation   (A. 17),   In  the  multi- 
channel MA  lattice,  the  orthogonality  of  the  backward  pre- 
diction error  signals  also  allows  the  G  matrices   to  be 
calculated  in  succession  providing  an  alternative  to  equation 
(A. 28b)  and  generalizing  the  single  channel  solution  given 
by  equation  (2.39).   Setting  G     to  minimize  the  trace  of 
the  error  covariance  matrix 


■n  (n)     r   (n),,  n    (n),,  J,  /  0  c  n  x 

PQ     =  £{eQ    (k)  eQ    (k)-  }  (2.50a) 
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where 


Sn(n)(k)  =  y(k)  «  y(n)(k)  (2.50b) 


results  in  a  solution  given  by 


Q(n)  s  F(n)'1  £{-(n)(k)  y(k)T}  (2.50c) 


Another  important  characteristic  of  the  lattice  solutions 
to  the  AR  and  MA  modeling  problems  given  by  equations  (2.M-9) 
and  (2.50)  and  their  single  channel  counterparts  is  that  they 
do  not  impose  any  requirements  to  window  the  data  when  finite 
time  averages  are  used  to  estimate  correlations.   The  auto- 
correlation function  of  a  signal  is  inherently  an  even  func- 
tion so  that  R   (n)  =  R   (-n),   This  fact  is  responsible  for 
vv        vv  r 

much  of  the  special  structure  of  the  correlation  matrices 
that  appear  in  the  model  solution  equations,  and  was  also 
used  in  the  derivation  of  the  Levinson  algorithm.   In  esti- 
mating the  autocorrelation  function  via  time  averaging  over 
a  finite  interval,  a  window  function  that  is  nonzero  over 
only  a  given  interval  must  be  applied  to  the  data  to  retain 
this  even  symmetry  property  in  the  estimate.   If  the  data  is 
not  set  to  zero  outside  a  given  interval,  end  effects  will 

destroy  the  symmetry  so  that  R   (n)  i    R   (-n)  as  deDicted 

vv        vv 

in  Figure  2.11. 


RQ 


v(t) 


v(t-n) 


v(t+n) 


Averaging  Interval 


Figure  2.11.   Time  averaging  to  estimate  correlations 
without  windowing. 


In  the  lattice  solutions  of  equation   (2.49)  however,  there 
is  no  requirement  to  make  such  an  artificial  assumption 
about  the  data  (that  it  is  zero  outside  some  interval). 

These  properties  of  the  lattice  solution  methods  were 
responsible  for  their  initial  use  by  Burg  in  his  work  on 
maximum  entropy   spectral  analysis  [Ref .  5]  and  have  con- 
tinued to  generate  interest  in  the  application  of  lattice 
methods  to  other  types  of  modeling  problems. 
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F.   ADAPTIVE  MODELING 

The  LMS  adaptive  algorithm  provides  a  well  known  alter- 
native method  for  obtaining  the  solution  to  the  AR  or  MA 
modeling  problems  which  does  not  require  the  estimation  of 
correlations  or  the  inversion  of  a  matrix   [Refs.  58,  59 
and  6  0].  This  algorithm  updates  an  estimate  of  the  model 
solution  vector  at  each  time  instant  by  an  amount  propor- 
tional to  the  negative  of  the  instantaneous  gradient  of  the 
cost  function;  i.e.  ,  in  a  MA  model, 


a+(k+l)  =  a+(k)  -  u  7+(k)  (2,51a) 


where  u  is  a  proportionality  constant  or  adaptive  gain. 
Since  the  actual  gradient  is  usually  not  known,  it  is 
approximated  by  using  the  square  of  a  single  sample  of  the 
error  as  an  estimate  of  the  MSE  so  that 


V+(k)  =  LeC£>! 
da 


=  -2  u+(k)  e(k)       (2.51b) 


a  =a  (k) 


and 


a+(k+l)  =  a+(k)+2u  u+(k)  e(k)  (2.51c) 


In  each  of  the  models  considered  here,  the  cost  function 
(MSE  or  trace  P)  is  a  quadratic  function  of  the  model  weights 
and  defines  a  concave  hyperparaboled  with  a  unique  minimum. 
The  functioning  of  the  LMS  algorithm  under  these  conditions 
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can  easily  be  understood  by  considering  the  scalar  case  of 
equation  (2.51)  illustrated  in  Figure  2,12. 

As  this  illustration  shows ,  the  algorithm  can  actually 
diverge  for  too  large  a  value  of  adaptive  gain.   The  rate 
of  convergence  is  also  dependent  on  "uhe  size  of  the  adaptive 
gain.   Widrow  has  shown  that  for  stability,  the  gain  must  be 
set  so  that 


0  <  \i    <     rr—: —  (2.52a) 

max 


While,  in  the  mean,  the  weight  vector  converges  with  an 
exponential  time  constant  of 


x  =  yr-rr (2.52b) 

2uA  . 
mm 


where  A  .   and  X  are  the  smallest  and  largest  eigenvalues 

mm      max  &      to 

of  the  input  autocorrelation  matrix  R    L.   From  the  stand- 

u  u 
point  of  stability,  \s    should  be  made  relatively  small  but 

for  rapid  convergence,  equation  (2.52b)  dictates  that  it 

should  be  large.   Setting 


u  =  r—  (2.53a) 

max 


where  a  is  a  normalized  gain  and  0  <  a  <  1,  equation  (2.52b) 
becomes 

3  1    max  (  0  c 0 ,  x 

T  -  7s  jrr  (2.53b) 

mm 


MSE 


MMSE 


k=0 


a(0) 


opt        a(0) -^ 
a)  Small  adaptive  gain;  steady  convergence  toward  the  solution 

MSE 


t 


k=l 


k=3 


k=0 


b)  Intermediate  adaptive  gain  5  oscilalory  convergence  toward 
the  solution 


t 

MSE 


k=2 


a(0)  _*. 

c)  Large  adaptive  gain;  divergence  away  from  the  solution 


Figure  2.12.   Behavior  of  the  LMS  adaptive  algorithm 
for  various  adaptive  gains. 


and  for  a  wide  disparity  between  the  largest  and  smallest 

eigenvalues  (A  .   <<  X    ),  convergence  will  be  quite  slow. 

°  mm     max         °  ^ 

This  consideration  becomes  increasingly  important  when  high 
order  model  solutions  are  obtained  adaptively  since  the 
dimensonality  of  the  input  autocorrelation  matrix  will  be 
high  and  the  possibility  of  a  wide  eigenvalue  disparity 
greater . 

These  same  adaptive  techniques  have  been  applied  by 
Griffiths  [Refs.  16,  17,  18  and  31],  to  the  AR  and  MA 
lattice  filter  structures  derived  in  the  last  section.   The 
key  difference  between  the  conventional  adaptive  filter  and 
the  adaptive  lattice  is  that  in  the  lattice,  the  adaptation 
is  carried  out  on  a  stage  by  stage  basis  for  each  of  the 
reflection  coefficient  matrices   while  in  the  more  conven-' 
tional  approach,  the  entire  weight  vector  is  adapted.   It 
has  already  been  established  that  the  lattice  structure 
makes  the  model  solutions  recursive  in  order.   Implementing 
the  lattice  adaptivity  makes  the  solution  recursive  in  time 
as  well  since  the  estimate  of  the  solution  at  each  instant 
is  dependent  upon  prior  estimates  of  the  solution, 

The  conventional  adaptive  filter  algorithm  forms  an 
error  signal  as  the  difference  between  some  desired  signal 
and  its  estimate-  i.e. 


e(k)  =  y(k)  -  a+Ck)Tu+(k)  (2.54) 


where  y(k)  is  the  desired  signal,  a   is  the  weight  vector 
and  u  (k)  is  the  input  vector,  and  the  time  update  for  the 
weight  vector  is  given  by  equation  (2,51c).   To  derive  the 
adaptive  AR  lattice  consider  equations  (2.45)  for  a  single 
stage.   The  lattice  in  general  has  vector  error  and  desired 
signals  and  coefficient  matrices  as  opposed  to  scalar  error 
and  desired  signals  and  a  coefficient  vector  in  equation 
(2,54)  but  such  a  generalization  is  straightforward.   Com- 
paring equation  (2.45a)  to  (2.54)  it  is  clear  that: 

1)  e      (k)  is  analogous  to  the  error  signal; 

2)  e    (k)  is  analogous  to  the  desired  signal; 

3)  e    (k-1)  is  analogous  to  the  input  signal  vector. 
Using  the  trace  of  P      as  a  cost  function  and  applying 

a  LMS  adaptive  algorithm  to  determine  the  forward  reflection 
coefficient  matrix  it  follows  that 


^(n+1),,  .,  x  _  „(n+l)n.    0  (n+D-Cn),.  .»  (n+l)M.T 
K      (k+1)  =  K      (k)  +  2y      e    (k-l)e      (k) 

(2.55a) 


With  these  analogies,   equations  (2.51c)  and  (2.55a)  are  seen 
to  be  virtually  identical  with  the  exception  that  (2,51c) 
uses  a  scalar  error  to  adapt  a  weight  vector  and  (2.55a)  uses 
a  vector  error  signal  to  adapt  a  coefficient  matrix. 

Proceeding  in  a  similar  fashion  with  equation  2.45b  it  is 
clear  that: 

1)  e      (k)  is  analogous  to  the  error  signal; 

2)  e    (k-1)  is  analogous  to  the  desired  signal; 


3)  e    (k)  is  analogous  to  the  input  signal  vector. 
With  the  trace  of  P       as  a  cost  function,  the  time 
update  relation  for  the  backward  reflection  coefficient 
matrix  is 


T7(n+1),,  ,,  v  _  «(n+l)M>    0-(n+l)   (n)M*  -(n+l)M,T 
K      (k+1)  =  K      (k)  +  2u       e    (k)  e      (k) 

(2.55b) 

For  a  MA  lattice,  equation  (2.47)  must  also  be  considered 
Multiplying  both  sides  of  (2.47)  by  minus  one  and  adding 
y(k)  results  in 


T 
n    (n+1),.  »  _    (n),,  x    «(n+l)  — (n+1),,  v       ,n    c . , 
eQ      (k)=en    (k)  -  G       e      (k)       (2.56) 


where  e_n      (k)  is  defined  as  in  equation  (2.50).   It  is 
evident  that  ? 

1)  e_n      (k)  is  analogous  to  the  error  signal; 

(  n  1 

2)  e.    (k)  is  analogous  to  the  desired  signal; 

3)  e      (k)  is  analogous  to  the  input  signal  vector. 
With  the  trace  of  P_n       as  a  cost  function,  the  time  up- 
date relation  for  G      is  given  by 


nCn+l),,...  x    n(n+l) ,,.        0  (n+1)  -(n+l),,x    (n+l),,xT 
G      (k+1)  =  G      (k)  +  2y      e      (k)  e,.      (*' 

(2.57) 
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It  is  significant  to  note  that  three  different  adaptive 
gains  have  been  used  in  equations  (2.55)  and  (2.57)  and 
that  the  gains  have  been  superscripted  indicating  that  they 
vary  from  one  lattice  stage  to  the  next.   For  stability  consi- 
derations the  adaptive  gain  used  in  the  LMS  algorithm  must 
satisfy  equation  (2.52a)  and  therefore  is  related  to  the 
largest  eigenvalue  of  the  input  autocorrelation  matrix  by 
equation  (2.53a).   In  developing  the  time  update  relations 
for  the  lattice  coefficients,  three  different  input  signals 
were  used  and  these  inputs  also  differ  from  one  lattice 
stage  to  the  next.   Indeed,  even  for  the  case  where  the  in- 
put x(k)  to  the  lattice  structure  is  stationary,  the  inputs 
to  all  lattice  stages  except  the  first  are  nonstationary 
since  these  inputs  are  outputs  of  other  lattice  stages.   This 
fact  indicates  that  time  varying  adaptive  gains  are  appro- 
priate as  well  in  equations  (2.55)  and  (2.57).   Equation 
(2.53a)  is  of  no  direct  usefulness  however  in  setting  the 
adaptive  gains  since  the  time  varying  eigenvalues  are  not 
known.   Recognizing  that  the  largest  eigenvalue  is  always 
less  than  the  trace  of  the  input  autocorrelation  matrix 
(which  is  a  measure  of  the  power  in  the  input  signal  vector) 
the  gains  can  be  set  as 


(n+1)  ,.,  v       a  ,  0  c  a    \ 
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where  a  L, (k)  ,  a  . , (k)   and  v  ^n (k)   are  estimates  of  the 
n+1       n+1  'n+1 

power  in  the  three  input  signal  vectors  and  a  is  a  normalized 
adaptive  step  size  with  0  <  a  <  1.   A  method  that  has  commonly 
been  applied  to  obtain  these  estimates  is  to  employ  a  first 
order  low  pass  filter  so  that 


dr  .,(k+l)2  =  [l-a]a  ., (k) 2+ae(n) (k~l)Te(n) (k-1)    (2.59a) 
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Taken  together,  equations  (2.55),  (2.57),  (2.58)  and  (2.59) 
define  the  adaptive  solutions  for  the  AR  and  MA  multichannel 
lattice  models. 

To  understand  the  potential  advantage  offered  by  the 
adaptive  lattice  form,  recognize  that  while  the  conventional 
approach  solves  a  high  order  minimization  problem  by  adapting 
all  the  coefficients  at  once,  the  lattice  breaks  the  problem 
down  into  a  succession  of  lower  order  minimizations  at  each 
stage  and  solves  these  lower  order  problems  adaptively. 


dimensionality  of  the  input  autocorrelation  matrices.,  at 
each  lattice  stage  in  general  is  significantly  less  than 
that  of  the  large  input  autocorrelation  matrix  in  the  con- 
ventional adaptive  algorithm  and  consequently  it  is  hoped 
that  the  possibility  of  a  large  eigenvalue  disparity  with 
its  attendant  slow  convergence  is  reduced. 

This  advantage  is  most  evident  in  a  single  channel 
adaptive  lattice  where  the  inputs  at  each  stage  are  single 
signals  and  their  corresponding  autocorrelation  matrices 
are  lxl  in  dimension.   In  this  case  the  ratio  of  smallest 
to  largest  eigenvalues  is  unity  and  the  convergence  of  each 
stage  is  quite  rapid  while  the  convergence  of  the  overall 
model  is  independent  of  the  eigenvalue  ratio  for  the  over- 
all higher  dimension  input  autocorrelation  matrix.   This  has 
been  demonstrated  by  Satorius  [Refs.  46  and  M-7]  who  has 
shown  that  the  single  channel  adaptive  lattice  converges 
much  more  rapidly  than  the  corresponding  conventional  adap- 
tive filter,  and  does  so  independently  of  the  eigenvalue 
ratio  on  the  overall  input  channel  autocorrelation  matrix. 

Furthermore  in  a  single  channel  adaptive  lattice,  the 
time  update  relations  are  simplified  by  the  fact  that  the 
forward  and  backward  reflection  coefficients  are  the  same. 
Using  the  average  of  the  mean  square  values  of  backward 
and  forward  prediction  errors  as  a  cost  function  and  applying 
an  adaptive  algorithm  it  follows  that 
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The  very  nature  of  the  lattice  structure  however  with 
the  output  of  one  stage  providing  an  input  to  the  next  stage, 
greatly  complicates  the  analysis  of  the  convergence  proper- 
ties of  the  adaptive  lattice  model.   Even  when  x(k)  ,  the 
input  to  the  lattice,  is  stationary,  inputs  to  all  stages 
except  the  first  are  nonstationary .   An  approximate  analysis 
of  convergence  and  stability  on  a  stage  by  stage  basis  is 
possible  if  it  is  assumed  that  all  prior  stages  have  con- 
verged and  are  providing  stationary  inputs  to  the  stage 

under  investigation.   With  this  assumption,  the  adaptive 

t  4.-    *   ^   „(n+l)   p-Cn+l)    .  n(n+l)    .  . 
solution  for  the  K      ,  K      and  G      matrices   are 

obtained  from  the  operation  of  three  independent  LMS  algo- 
rithms as  shown  earlier,  with  inputs  given  by  e    (k-1), 
e    (k)  and  e      (k),  respectively.   Stability  limits  on 
the  adaptive  gains  used  in  the  stage  and  the  convergence 
properties  of  the  stage  are  then  determined  by  the  eigen- 
values of  the  P(   ,  P(n)  and  p(n+1)  matrices.    A  more  exact 
analysis  of  the  properties  of  the  adaptive  lattice  that 


considers  the  nonstationary  character  of  the  inputs  to  the 
second  and  subsequent  stages  is  not  currently  available. 
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III.   ARHA  MODELING 

One  of  the  most  serious  disadvantages  of  either  AR  or 
MA  modeling  is  the  fact  that  to  adequately  represent  even 
simple  linear  systems,  both  methods  may  require  a  large 
number  of  parameters  (a  high  order  model) .   This  problem 
arises  since,  from  a  transfer  function  standpoint,  AR  and 
MA  models  attempt  to  model  the  system  using  only  poles  or 
only  zeros,  in  spite  of  the  fact  that  the  physical  system 
may  have  both  zeros  and  poles.   While  modeling  the  effects 
of  a  zero  with  a  number  of  poles  and  visa  versa  can  be 
analytically  justified  as  shown  in  the  previous  chapter, 
it  makes  far  more  sense  (both  from  the  viewpoint  of  model 
accuracy  and  efficient  use  of  model  parameters)  to  let  the 
model  represent  the  system  as  it  really  is  with  both  zeros 
and  poles  if  this  is  at  all  possible.   The  ARMA  model  is  a 
generalization  of  the  AR  and  MA  models  and  accomplishes 
exactly  this ,  representing  the  system  in  rational  transfer 
function  form. 

It  is  worth  noting  that  the  titles  of  all  pole  and  all 
zero  modeling  that  have  been  associated  with  AR  and  MA 
modeling  are  misnomers.   Both  have  equal  numbers  of 
zeros  and  poles.   In  the  AR  model  however,  all  the  zeros 
occur  at  the  origin  of  the  z-.plane  as  do  the  poles  of  a  MA 
model.   The  ARMA  model  removes  these  constraints. 
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After  a  brief  discussion  of  two  alternate  ARMA  modeling 
methods  due  to  Shanks  and  Prony ,  the  equation  error  formu- 
lation for  ARMA  modeling  is  developed  and  the  new  results 
presented.   Model  transition  formulas  relating  the  ARMA 
model  to  the  MA  and  AR  models  are  developed  and  the  input 
signal  requirements  of  the  modeling  process  explored.   It 
is  shown  that  after  suitable  modification,  the  Levinson 
algorithm  can  be  applied  to  solve  the  ARMA  modeling  problem 
recursively  in  order  and  lattice  solution  methods  are  also 
developed  for  both  a  batch  processing  and  an  adaptive  model 
solution.   The  results  of  experimental  simulations  of  both 
of  these  modeling  solution  methods  are  presented  and  dis- 
cussed, and  comparisons  are  made  with  conventional  means  of 
ARMA  modeling  using  the  equation  error  formulation.   Finally 
it  is  also  shown  that  the  lattice  solution  methods  can  be 
generalized  to  solve  for  the  multichannel  ARMA  model  with 
arbitrary  numbers  of  inputs  and  outputs. 

A.   LINEAR  ARMA  MODELING  AND  ITS  RELATION  TO  AR  AND  MA 
MODELING 

The  ARMA  model  for  linear  systems  assumes  the  current 

value  of  the  output  of  the  system  is  given  by  a  weighted 

combination  of  present  and  past  values  of  the  input  and 

past  values  of  the  output.   In  terms  of  the  discussions  of 

Chapter  I,  F3Q  is  assumed  to  be  zero  and  F  -  and  F„Q  take 

on  the  following  forms 

iU 
F1Q[u(k)]  =   £   *(n)  u(k-n)  (3.1a) 
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F   Cy(k-l)]  =    J2        b(n)  yCk-n)  (3.1b) 

l[}  n=l 


leading  to  a  transfer  function  representation  for  the  system 

given  by 
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a(n)  z 

u (    \         n=0  A(z)  ,.,  , * 

H(Z)  =  — ^ _   =   •gTTy  (3.1c) 

1-      b(n)z^n 
n=l 

A  number  of  methods  exist  for  finding  the  model  coef- 
ficients (a(n)  }  and  {b(n)}.   As  stated  in  Chapter  I,  a  MMSE 
solution  via  the  direct  form  modeling  approach  requires  the 
solution  of  a  system  of  highly  nonlinear  equations  and  in 
general  is  untractable.   An  alternative  is  to  first  obtain 
an  estimate  of  the  denominator  polynomial  B(z)  by  some 
means  such  as  AR  modeling  and  then  using  this  in  the  system 
shown  in  Figure  3.1,  estimate  the  numerator  polynomial  A(z) 
by  setting  its  coefficients  to  minimize  the  mean  square 
value  of  the  error.   This  method  was  first  explored  by 
Shanks.  [Ref,  49] 

Another  alternative  is  to  apply  the  Prony  method  [Refs. 
8,  52  and  56]  derived  in  Appendix  E  which  obtains  the  model 
parameters  by  matching  the  impulse  response  of  the  system 
and  model  over  the  first  W+M+l  sample  intervals.   Both  of 
these  techniques  share  a  common  characteristic.   They  both 
start  by  independently  estimating  the  denominator  coefficients 
(or  model  poles)  and  then,  given  this  estimate,  solve  for 
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Figure  3.1.   Shanks  method  for  ARMA  modeling 

the  numerator  coefficients  (or  model  zeros).   This  is  in- 
tuitively unappealing  in  that  one  would  expect  these  two 
estimation  problems  to  be  more  closely  coupled  with  the 
zero  estimates  also  affecting  the  estimates  of  the  poles. 

The  application  of  the  equation  error  formulation  to  the 
ARMA  modeling  problem  permits  simultaneous  estimation  of  the 
model  zeros  and  poles  and  was  first  used  by  Kalman  [Ref.  23] 
in  work  on  self-optimizing  control  systems.   The  prediction 
error  form  of  the  model  is  considered  here  where  F„_  is  set 
to  zero  while 
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The  analysis  model  is  depicted  in  Figure  3.2  where 
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and  these  polynomials  are  the  estimates  of  the  system  trans- 
fer function  numerator  and  denominator. 
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Figure  3.2.   The  equation  error  formulation  for 
ARMA  modeling. 


The  expression  for  the  model  error  can  be  written  as 


eQ(k)  =  y(k)  -  [y(k)T  j  u+(k)T] 


(3.4a) 


where  y(k)  and  b  are  defined  as  in  equation  (2.7)  and 
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(3.4b) 


a+  =  Ca(0)  • • •  a(M)]T 


(3.4c) 


This  results  in  an  expression  for  the  mean  square  error 
which  is  a  quadratic  function  of  the  model  coefficients, 
with  the  MMSE  solution  for  those  coefficients  given  by 
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1 ,   Model  Transition  Relationships 

Comparing  equations  (3.4)  and  (3.5)  with  their  AR  and 
MA  counterparts,  it  is  clear  that  the  ARMA  model  provides  a 
generalization  of  these  other  models  since  they  can  be 
obtained  from  the  ARMA  model  by  assuming  that  either  a   or  b 
is  zero.   Consequently  it  is  susceptible  to  the  same  typ^  of 


bias  introduced  in  the  AR  and  MA  models  by  the  presence  of 
additive  noise  on  either  the  system  input  or  output  signals. 
To  develop  the  relationships  between  these  models  further, 
consider  the  inversion  of  the  correlation  matrix  in  equation 
(3.5a)  in  terms  of  its  component  matricies.   Since  the  left 
and  right  inverses  of  a  nonsingular  square  matrix  are  the 
same,  either 
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can  be  used  to  find  the  required  inverse  in  partitioned  form 
Solving  for  the  right  inverse  of  equation  (3.6b)  yields 
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Solving  for  the  left  inverse  of  equations  (3.6a)  gives  iden- 
tical results  for  A  and  D  while  equivalent  but  different 
forms  are  obtained  for  B  and  C  given  by 
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Using  equations  (3.7)  in  equation  (3.5a),  the  solutions  for 
the  ARMA  coefficient  vectors  are  given  by 
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and 
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The  matrix  inversion  lemma  [Refs.  11  and  19]  which  states 
that 

(E-t-FGH)'1  =  E"1  -  E'1F(G"1+HE"1F)"1  HE^1        •    (3.10) 


78 


for  nonsingular  square  matrices   E,  G  and  E+FGH,  can  be 
used  in  equation   (3.9)  to  rewrite  them  as 
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The  all  zero  and  all  pole  model  solutions  of  corresponding 
orders  however  are 
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where  subscripts  are  used  to  distinguish  these  solutions 
from  their  counterparts  in  the  ARMA  zero  pole  model. 
From  equations  (3.11)  it  follows  that 
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Following  a  similar  development,  the  left  inverse  relation- 
ships of  equations ( 3 . 8 )  can  be  used  to  write 
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Equations  (3.12)  are  termed  the  "Zero  Pole  Model  Transi- 
tion Formulas"  and  specify  the  relationships  between  the 
various  models.   It  is  interesting  to  note  that  equations 
(3.12a)  and  (3.12b)  take  the  form  of  a  linear  observer  with 
a  new  estimate  of  the  solution  given  by  the  old  estimate 
plus  a  gain  times  an  error  term.   To  gain  some  insight  into 
the  functioning  of  these  formulas,  consider  the  form  of  R   (n) 
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Assuming  that  W=N  and  M=M,  and  writing  equation  (3.13a)  for 
-N<n<-1  and  equation  (3.13b)  for  -M  <n  <0  results  in 
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These  constraints  on  the  system  input  and  output  auto  and 
cross  correlation  coefficients  are  the  ARMA  modeling  equa- 
tions of  (3.5a)  with  the  model  coefficients  replaced  by  the 
system  parameters.   In  AR  modeling,  b_Ap  is  set  to  satisfy 
the  constraints  of  equation  (3.14a)  with  the  assumption  that 
a   is  zero.   The  error  term  in  the  model  transition  formula 
(3.12a)  then  checks  this  solution  to  see  if  it  also  satisfies 
the  constraints  of  equation  (3,14b)  still  assuming  that  a. 
is  zero. 


error  =  R     b_Ap  -  r  +  (3.15) 
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If  this  error  is  zero  and  the  constraints  of  equation  (3.14b) 
are  satisfied,  equation  (3.12a)  sets  b_7p  =  b  p  and  equation 
(3.12d)  sets  a   to  zero.   If  however  the  error  is  nonzero, 
(3.12a)  adjusts  b.p  in  proportion  to  the  error  to  obtain 
b_7p  and  (3.12d)  then  provides  a  nonzero  &7p-      Thus  equations 
(3.12a)  and  (3.12d)  are  complementary,  specifying  the  ARMA 
or  zero  pole  model  solution  of  order  M  over  N  when  given 
the  N-th  order  AR  or  all  pole  model  solution. 

In  like  manner,  equations  (3.12b)  and  (3.12c)  give  the 
ARMA  model  solution  of  order  M  over  N  when  given  the  M-th 
order  MA  or  all  zero  model  solution.   The  all  zero  solution 
is  obtained  from  equation  (3. 14b)  assuming  that  _b  is  zero. 
Equation  (3.12b)  checks  this  solution  against  the  con- 
straints imposed  by  equation  (3.14a)  with  the  same  assump- 

+  .  + 

tions  and  adjusts  a.7  appropriately  to  determine  a^p . 

Equation  (3.12c)  then  sets  h_7p,  completing  the  zero  pole 
model  solution. 

2 .   Modeling  Input  Signal  Requirements 

Another  aspect  of  the  modeling  problem  that  must  be 
considered  is  that  of  system  identif iability .   If,  from  the 
available  measurements  of  signals,  a  model  can  be  obtained 
that  accurately  represents  the  system's  operation,  the 
system  is  considered  identifiable.   The  two  issues  that 
arise  therefore  are  the  measurement  requirements  (which 
signals  must  be  measured)  and  requirements  on  the  input 
signal  used  to  excite  the  system  during  the  modeling 
process.   In  the  equation  error  formulation  of  the  ARMA 
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model,  both  existing  signals  (system  input  and  output)  must 
be  observed  (or  at  least  a  knowledge  of  their  auto  and  cross 
correlation  functions  must  be  available) .   Most  discussions 
of  input  signal  requirements  for  identif iability  simply 
state  that  the  system  can  be  identified  if  the  input  signal 
is  sufficiently  rich,  persistent  or  exciting,  eg  [Ref.  19] 
To  explore  the  question  of  input  signal  requirements  fur- 
ther, consider  the  mean  square  equation  error  cost  function 
being  minimized.   Assuming  that  the  equation  error  signal 
is  ergodic  and  has  finite  energy  its  mean  square  value  is 
obtained  via  time  averaging  as 


E2  =  s{e(k)2}  =    £    e(n)2  (3.16) 

n=-co 


Applying  Parsevals  relation  this  becomes 

oo  7T 

E2  =   £   e(n)2  =  i  J     E(ej9)E*(ej9)  d9     (3.17) 
n  =  -°°  -it 

where  0=wT  and  *  indicates  the  complex  conjugate.   The 
equation  error  is  represented  in  the  transform  domain  as 


E(z)  =  [B(z)H(z)  -  A(z)]U(z)  (3.18) 

and  using  this  in  equation  (3.17),  the  cost  function  becomes 

IT 

E2  =  7?  J  |CB(e^9)H(ej9)-A(ej9) |  '  |U(ej9)|  d9   (3.19) 

-7T 


showing  that  the  power  spectrum  of  the  input  signal  acts  as 
a  frequency  dependent  weighting  function  on  a  transfer 
function  error  term.   Therefore  to  identify  the  system 
equally  well  at  all  frequencies,  the  input  must  have  a 
flat  spectrum  as  will  be  the  case  for  a  white  noise  input 
or  an  impulse  function  input.   Otherwise,  the  model  trans- 
fer function  will  only  be  matched  to  the  system  transfer 
function  over  the  range  of  frequencies  where  the  input 
signal  has  significant  power. 

As  an  example  consider  the  equation  error  ARMA 
model  for  a  fourth  order  system  driven  by  a  single  sine 
wave  input  at  a  frequency  of  tt/3.   According  to  equation 
(3.19),  the  model  transfer  function  will  only  be  required 
to  match  the  system  at  this  single  frequency,  and  to  accom- 
plish this,  only  a  first  order  model  is  needed.   Any  in- 
crease in  model  order  above  first  order  therefore  should 
have  no  effect.   Figure  3.3a  shows  a  comparison  of  the  mag- 
nitude spectrum  of  the  fourth  order  system  and  its  first 
order  ARMA  model  obtained  using  the  sinusoidal  input  and 
as  anticipated  they  match  at  the  frequency  it/ 3  (coinciden- 
tally  they  also  match  at  one  other  frequency  as  well). 
Figure  3 . 3b  shows  the  same  comparison  but  with  a  fourth 
order  ARMA  model,   It  is  clear  that  increasing  the  model 
order  failed  to  improve  its  accuracy  and  that  the  model 
accurately  represents  the  system  only  at  the  frequency  of 
the  input  signal  and,  by  coincidence,  at  one  other  frequency 
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(a) 


SYSTEM 

1  ST  ORDER  MODEL 


(b) 


-30 


-40 


-50 


SYSTEM 

4  TH  ORDER  MODEL 


Figure  3.3.   ARMA  models  for  a  system  driven  by 
a  single  sinusoid. 
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It  should  be  noted  that  this  type  of  analysis  to 
determine  input  signal  requirements  could  be  applied  to  the 
AR  and  MA  models  as  well  resulting  in  the  same  conclusions. 

B.   A  RECURSIVE  IN  ORDER  SOLUTION  FOR  THE  ARMA  MODEL 

Since  the  equation  error  formulation  for  the  ARMA  model 
is  a  generalization  of  similar  formulations  for  AR  and  MA 
modeling,  it  is  reasonable  to  assume  that  a  Levinson-type 
algorithm  could  be  devised  to  obtain  the  ARMA  solution 
recursively  in  order,  and  that  from  that  algorithm,  lattice 
filter  methods  applicable  to  the  ARMA  modeling  problem  could 
be  derived.   Attempts  to  develop  such  an  algorithm  directly 
for  the  ARMA  modeling  equation   (3.5a),  however,  fail  to 
provide  useful  results.   The  first  problem  that  arises  is 
in  deciding  which  model  order  to  make  recursive;  the  order 
of  the  numerator  polynomial,  the  order  of  the  denominator 
polynomial  or  both.   If  it  is  assumed  that  the  numerator 
and  denominator  are  of  equal  orders  (M=N) ,  the  ARMA  modeling 
equations  become  as  shown  in  equation  (3.20).   However, 
efforts  to  develop  a  Levinson-type  algorithm  for  this  system 
of  equations,  where  the  numerator  and  denominator  polynomials 
of  the  model  are  incremented  simultaneously  to  obtain  re- 
cursive in  order  solutions,  are  still  frustrated  by  the 
presence  of  the  (N+l)-st  row  and  column  in  equation  (3.20). 
This  arises  because  the  numerator  coefficient  vector  a   is 
a  (N+l) -vector  while  the  denominator  coefficient  vector  b 
is  a  N-vector.   If  it  is  further  assumed  that  the  coefficient 
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a(0)  is  known  in  advance  or  can  be  estimated  in  some  other 
fashion,  equation  (3.20)  for  the  solution  for  the  remaining 
2N  coefficients  becomes  as  written  in  equation  (3.21). 
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uy 
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(3.21) 

The  coefficient  a(0)  essentially  has  the  role  of  a  gain  for 
the  model,  and  a  method  for  estimating  it  after  all  the  other 
coefficients   are  obtained  (as  was  done  in  AR  modeling)  will 
be  discussed  later.   Equation  (3,21)  can  be  written  as 
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-a(0) 


(3,22) 


Q« 


Now  consider  the  form  of  a  two  channel  autoregressive 
model  where  the  two  input  channels  are  y(k)  and  u(k) ;  that  is 


x1(k)  =  y(k) 


x2(k)  =  u(k) 

Using  equations  (A.M-a)  and  (A.  7b)  the  two  channel  AR  modeling 
equations  are 


R      R 
--yy    -yu 


R      R 
— uy    — uu 


d-ll     ^12 


^21     ^22 


r       r 
-yy      -yu 


r       r 
--uy     — uu 


(3.23) 


and  comparing  equations  (3.22)  and  (3.23)  it  is  clear  that 
with  the  exception  of  the  gain  term  a(0),  the  ARMA  modeling 
solution  can  be  obtained  from  the  two  channel  AR  solution 
of  (3.23).   Furthermore  equation  (3.23)  can  be  solved  inde- 
pendently of  the  gain  term  via  the  Levinson  algorithm  as 
shown  in  Appendix  A  or  via  the  multichannel  lattice  methods 
developed  in  the  previous  chapter.   Then  all  that  remains 
to  complete  the  solution  for  the  ARMA  model  is  to  estimate 
the  gain  term  a(0)  and  solve  for  the  other  model  coefficients 
using 


4n 


$21 


k.2 


-d-2  2 


(3,24) 


-a(0) 
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From  this  it  follows  that  the  transfer  functions  A(z)  and 
B(z)  for  the  ARMA  prediction  error  analysis  model  can  be 
related  to  the  two  channel  AR  prediction  error  matrix  poly- 
nomial transfer  function  by 


B(z) 


-A(z) 


CI  -  D.Cz)] 


-a(0) 


(3.25) 


It  is  also  easy  to  relate  the  ARMA  equation  error  to  the 
two  channel  AR  prediction  error  vector.   Using  equations 
(A. 3)  and  (A, 5),  the  two  channel  AR  prediction  error  vector 
is  written  as 


e(k) 


r-   -.T 
y(k) 

u(k) 


[y(k)T;  u(k)T] 

i 


^11    ^12 
^21    ^2  2 


(3,26) 
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Defining 


*  = 


1 
-a(0) 


(3,27a) 
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and  postmultip lying  equation  (3.26)  by  ^  and  using  (3.24)  it 
follows  that 


e(k)T  ip    =  y(k)  -  a(0)u(k)  -  [y(k)Ti  u(k)T] 


(3.27b) 


But  from  equation  (3,4a)  this  is  exactly  the  ARMA  model 
equation  error  so  that 


eQ(k)    =    e(k)1    £ 


(3.27c) 


and 


e{e0(kr} 


T 
\p      P    \\i 


(3 .27d) 


where  P  is  the  forward  prediction  error  covariance  matrix 
for  the  2  channel  AR  model.   Equation  (3.27d)  also  provides 
a  means  of  estimating  the  gain  term  a(0)  after  the  two  channel 
AR  solution  has  been  obtained  by  setting  it  to  minimize  the 
mean  square  value  of  equation  error  resulting  in 


a(0)  = 


e{e  (k)e  (k)} 
u y_ 

s{eu(k)2} 


(3,28) 


and  completing  the  ARMA  model  solution 
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The  portion  of  the  ARMA  model  solution  in  equation  (3.24) 
given  by  the  two  channel  AR  solution  can  be  found  recursively 
in  order  using  either  the  Levinson  algorithm  or  the  lattice 
filter  techniques ,   If  the  desired  ARMA  model  order  is  not 
known  in  advance,  a  model  gain  term  a(0)  can  be  estimated 
for  each  order  two  channel  AR  solution  to  find  the  ARMA 
model  of  corresponding  order  along  with  its  MSE .   In  this 
fashion,  the  entire  family  of  ARMA  models  for  the  system  from 
order  zero  to  order  N,  along  with  their  mean  square  errors 
are  obtained.   If,  on  the  other  hand,  the  desired  ARMA  model 
order  is  known  apriori,  the  gain  term  need  not  be  calculated 
at  each  stage .   Only  one  gain  term  must  be  calculated  to 
obtain  the  ARMA  solution   after  the  appropriate  order  two 
channel  AR  solution  has  been  found. 

It  has  already  been  shown  that  to  fully  identify  the 
system  using  the  equation  error  ARMA  formulation,  the  input 
signal  must  have  a  flat  spectrum  as  in  the  case  of  white 
noise.   When  white  noise  is  used  as  the  system  input  u(k) , 
simplifications  emerge  in  the  solution  of  the  two  channel 

AR  model  via  the  Levinson  algorithm  or  lattice  methods. 

2 
For  a  white  sequence  with  variance  a      , 
n  u  ' 
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and 
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n  <  0 


R   (n)  = 
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u  — 


(3.29b) 


where  h(n)  is  the  sampled  impulse  response  of  the  system. 
Consider  equation  (A, 16c)  for  K 
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This  shows  that  k, ~   and  k««   are  zero  and  furthermore  since 

(n)    ,   (n)  -     ,  ,    ...        ,, 

r  and  r    are  zero  for  all  n  it  is  seen  that 
— yu      — uu 
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(n) 

'22 


(3.29d) 


for  all  n  as  well.   This  can  readily  be  understood  by  con- 
sidering the  role  of  these  two  coefficients  at  each  stage 
in  the  AR  prediction  of  y(k)  and  u(k) .   k  ~  and  k?~  are  the 
coefficients  used  in  trying  to  predict  u(k)  from  past  values 
of  y(k)  and  u(k) ,  and  when  u(k)  is  a  white  sequence   it 
cannot  be  predicted,  forcing  these  coefficients  to  be  zero, 
No  such  simplifications  occur  in  the  backward  prediction 
problem  (and  therefore  in  the  K"  matrices)   since  even  for 
a  white  u(k) ,  a  backward  prediction  of  u(k-n)  from  subse- 
quent values  of  u  and  y  is  possible.   This  is  because  in 


general,  a  linear  dependence  of  y(k)  upon  past  and  present 
values  of  u(k)  can  occur  (and  certainly  will  occur  when  the 
relationship  between  y(k)  and  u(k)  is  described  by  an  ARMA 
model)  , 

As  a  result  of  these  simplifications,  it  is  seen  from 
equation  (A. 19a)  that  the  polynomials  d  ~(z)  and  d??(z)  are 
zero  when  u(k)  is  white  and  the  ARMA  model  is  given  by 


B(z) 


-A(z) 


1-d  u(z) 


-d21(«) 


-a(0) 


(3.30) 


In  this  special  case  it  follows  that 


B(z) 


detCl  -D  (z>] 


(3.31) 


and  therefore  stability  of  the  ARMA  model  and  of  the  two 
channel  AR  model  are  equivalent.   In  general,  however ,  no 
such  connection  exists  for  arbitrary  input  signals.   Further- 
more, even  when  the  system  input  u(k)  is  white,  solutions 
for  k  ?  and  k??  will  not  in  general  be  exactly  zero  since 
the  required  correlations  are  usually  not  known  and  must  be 
estimated . 

This  development  showing  that  the  ARMA  model  solution 
can  be  obtained  from  a  two  channel  autoregressive  model  de- 
pends on  two  assumptions ; 

1)   The  numerator  and  denominator  polynomial  orders  in 
the  model  transfer  function  are  assumed  to  be  the 
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same  and  are  incremented  simultaneously  to  build  up 
the  desired  solution  recursively  in  order; 
2)   It  is  assumed  that  the  coefficient  of  z  to  the  zero 

power  (aCO))  in  the  numerator  polynomial  of  the  model  is 
either  known  in  advance,  or  that  another  means  of 
estimating  it  can  be  found  so  that  it  need  not  be 
estimated  directly  in  the  modeling  equations  of 
(3.20)  , 
The  second  assumption  causes  no  concern  since  the  two  channel 
autoregressive  solution  is  obtained  independently  of  a(0), 
and  given  that  solution,  it  has  been  shown  that  a(0)  can 
indeed  be  estimated  in  another  fashion  in  equation  (3.28). 
The  first  assumption  however,  warrants  further  consid- 
eration  since  it  seems  somewhat  restrictive  (at  first)  to 
require  that  the  numerator  and  denominator  polynomials  of 
the  model  have  the  same  order  when  in  fact,  the  system  being 
modeled  may  have  different  order  numerator  and  denominator 
polynomials.   To  see  that  this  assumption  is  not  restrictive 
in  general,  consider  what  is  occuring  as  the  model  is  built 
up  recursively  in  order.   At  each  model  order  n,  the  pro- 
cedure finds  the  best  n~th  order  model  (with  n  zeros  and  n 
poles)  in  a  minimum  mean  square  equation  error  sense. 
Making  the  model  numerator  and  denominator  orders  different 
(or  equivalently ,  forcing  some  of  the  coefficients  to  zero 
in  the  model  where  the  orders  are  the  same)  places  a  priori 
constraints  on  the  model,  forcing  some  of  the  poles  or 
zeros  to  the  origin  in  the  z-plane,  rather  than  allowing  the 


model  to  place  them  at  will  to  minimize  the  cost  function. 
As  an  example,  consider  the  process  of  obtaining  an  ARMA 
model  for  a  system  given  by 

2 
£  a(n)  z~n 

tf(z)   =   a=I2* 

1-  £  fa(n)  z~n 

n  =  l 

where  two  of  the  systems  four  zeros  actually  occur  at  the 
origin  of  the  z  plane.   Constraining  any  of  the  model  zeros 
to  the  origin  at  orders  one,  two  or  three  will  result  in  a 
model  with  higher  cost  (MSE)  than  if  they  were  not  con- 
strained.  Even  at  order  three,  a  model  without  constraints 
can  be  expected  to  use  the  "extra"  zero  to  help  in  approxi- 
mating the  effects  of  the  system's  fourth  pole  as  shown  in 
equation  (2.13)  yielding  a  lower  cost  and  more  accurate 
model  than  would  result  if  one  zero  were  forced  to  the  origin, 
Only  at  order  four  are  such  constraints  reasonable  but  even 
then,  they  are  not  necessary  since  the  modeling  procedure 
itself  should  recognize  that  the  best  fourth  order  model 
will  have  two  zeros  at  z=0. 

Therefore,  it  is  seen  that  assuming  equal  orders  for  the 
model  numerator  and  denominator  is  entirely  reasonable  as  a 
general  approach  in  obtaining  MMSE  models  for  unknown  sys- 
tems or  even  reduced  order  models  for  known  systems.   When  it 
is  known  in  advance  that  the  best  MMSE  model  for  a  system 
has  zeros  at  z  =  0,  imposing  such  a  constraint  on  the  model  can 
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reduce  the  computational  complexity  of  obtaining  the  solution, 
but  even  here  the  assumption  of  equal  orders  is  not  restric- 
tive. 

C,   LATTICE  FORM  ARMA  MODELING 

In  chapter  two  it  was  shown  that  the  lattice  structure 
of  Figure  2,7  could  be  applied  to  solve  the  multichannel 
AR  modeling  problem  in  terms  of  the  reflection  coefficient 
matrices   given  by  equations  (2.49).   Since  the  ARMA  model 
solution  can  be  obtained  from  a  two  channel  AR  solution  with 
y(k)  and  u(k)  as  the  input  channels,  only  the  structure 
described  by  equation  (3,27c)  need  be  added  to  a  two  channel 
AR  lattice  to  obtain  the  lattice  form  of  the  ARMA  analysis 
model.   It  is  interesting  to  look  at  the  exact  structure  of 
this  lattice  model  as  shown  in  Figure  3.4-  for  a  second  order 
case.   This  structure  is  seen  as  a  lattice  interconnection 
of  two  single  channel  AR  lattices  operating  on  the  input 
signals  y(k)  and  u(k).   The  coefficients  on  the  main  diag- 
onals  of  the  K  and  K  matrices   specify  the  single  channel 
lattices  while  the  off  diagonal  elements  specify  the  inter- 
connections.  (This  will  also  be  the  case  for  extensions 
to  lattices  with  any  number  of  channels.) 

The  ARMA  synthesis  model  implementing  the  transfer  func- 
tion A(z)/B(z)  can  also  be  put  in  lattice  form.   The  for- 
ward prediction  in  the  two  channel  AR  analysis  lattice  is 
described  by 
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and  the  equation  for  e,  (k)       can  be  rewritten  as 


,.  *(n)      ,,  x(n+l),,  (n+D—  ,,  -.sCn),,  (n+D—  ,.  ,x(n) 
e  (k)     =  e  (k)     +  k,  -,     e  (k-1)    +k01     e  (k-1) 
y  y  11      y  21      u 

(3.33) 


Equation  (3.33)  along  with  the  equation  of  e   in  (3.32),  and 
the  equations  for  the  backward  prediction  (2.45b)  describe 

the  structure  shown  in  Figure  3.5  for  a  second  order  case. 

(  2  ) 

To  provide  the  required  input  at  e  (k)    ,  recognize  from 

equation  (3.27c)  that 


e  (k)(2)  =  en(k)  +  a(0)e  (k)(2) 
y  0  u 


(3.34a) 


If  the  ARMA  model  is  an  accurate  representation  of  the 
system,  the  equation  error  e  (k)  will  be  quite  small  (ideally 
zero)  so  that  in  general  for  the  N-th  order  case 


,,x(N)     ,n.    ,,  v(N) 
e  (k)     -  a(0)  e  (k) 

y  u 


(3.34b) 


This  is  indicated  by  the  dashed  feedback  path  in  Figure  3 . 5 
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D.   ARMA  MODEL  SIMULATIONS;  BATCH  PROCESSING 

ARMA  modeling  procedures  for  linear  systems  using  both 
the  lattice  filter  method  and  a  brute  force  matrix  inver- 
sion with  equation  (3.20)  have  been  implemented  and  the 
results  of  these  two  approaches  have  been  compared  in  over 
a  thousand  model  simulations  of  more  than  thirty  different 
linear  systems.  The  experimental  results  which  follow  are 
a  representative  sampling  of  these  simulations. 

In  the  lattice  filter  method,  equations  (2.49)  were  used 
to  calculate  the  forward  and  backward  reflection  coefficient 
matrices ,   with  time  averages  over  a  specified  interval  used 
to  estimate  the  required  correlations  in  P     =  P    and 
A(n)  for  0  <  n  <  N-l.   Equations  (A. 15),  (A. 16a)  and  (A. 16b) 
were  used  to  obtain  the  two  channel  AR  model  coefficients 
from  the  K  and  K  matrices   and  with  the  gain  calculated 
in  equation  (3.28),  equation  (3.24)  was  used  to  obtain  the 
desired  ARMA  model  coefficients.   Equations  (A. 18)  were 
used  to  update  the  forward  and  backward  prediction  error 
covariance  matrices   from  one  lattice  stage  to  the  next. 
Figure  3.6a  provides  a  flow  diagram  of  the  procedure. 

In  the  brute  force  matrix  inversion  method  with  equation 
(3.20)  time  averaging  was  again  employed  to  estimate  the 
required  correlation  coefficients.   A  rectangular  window 
was  applied  to  the  data,  however,  to  retain  the  even  symmetry 
of  the  autocorrelation  function  in  these  estimates .   The 
ARMA  model  coefficients  were  then  obtained  using  a  general 
purpose  library  subroutine  (which  employed  gaussian 
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elimination)  to  solve  equation  (3.20).   Figure  3.6b  provides 
a  flow  diagram  of  this  procedure. 

In  both  cases  zero  mean,  unit   variance  gaussian  white 
noise  was  used  as  the  system  input.   In  the  simulation  re- 
sults that  follow,  a  description  of  each  system  discussed 
(transfer  function  coefficients,  zero  locations  and  pole 
locations)  is  listed  in  tabular  form  and  root  locations  in 
the  z  plane  as  well  as  transfer  function  magnitudes  are 
plotted  for  the  system,  and  various  models  obtained  for  it. 
In  each  case,  models  were  obtained  for  averaging  intervals 
of  200,  500,  1000,  2000  and  4-000  data  points.   Only  the 
results  for  the  two  extremes  of  200  and  4-000  points  are 
included  here . 

The  first  system  considered  has  a  second  order  numerator 
in  z  inverse  and  a  fourth  order  denominator,  and  its  char- 
acteristics are  listed  in  Table  3.1.   Figure  3.7  shows  a 
comparison  of  the  root  locations  and  transfer  function 
magnitude  of  this  system  with  those  of  fourth  order  lattice 
filter  and  brute  force  models  obtained  when  correlations 
were  estimated  by  averaging   over  200  samples.   Figure  3.8 
provides  the  same  comparison  for  a  longer  averaging  interval 
of  4000  samples  of  data.   While  both  methods  perform  com- 
parably with  the  longer  averaging  interval,  the  lattice 
method  produces  a  far  more  accurate  model  with  the  short 
averaging  interval.   It  is  also  interesting  to  compare  the 
performance  of  the  two  methods  when  the  system  is  overmodeled; 
that  is ,  when  the  model  order  is  increased  beyond  that  of 
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Flow  diagram  of  batch  processing 
lattice  solution  for  all  ARMA 
models  orders  1  to  N. 
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Flow  diagram  of  batch  processing  brute 
force  solution  for  all  ARMA  models  of 
order  1  to  N. 
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TABLE    3.1 

SYSTEM    A 

TRANSFER    FUNCTICN  COEFFICIENTS 

NUMERATOR  DENOMINATOR 
A(C)=      0.25000 

A(l)=       0.35000  B(l)  =       i. 14000 

A<2)  =       0.24500  6(2)^    -1.45490 

A(3)  =       0.0  B(3)=       C. 88490 

A(4)=       0.0  B(4)=    -0.40745 


ROOT    LOCATIONS 


ZEROS 


RE 
•C7CC0G 
•0.7CC00 

CO 

CO 


IM 
C.7000C 
-0.7000Q 
0.0 
0.0 


PCLES 


RE 
0.50000 
0.5000C 
0.07000 
0.07000 


IM 

0.50000 
-C.5000C 

0.90000 
-0.90000 
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Figure  3 . 7 
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the  system.   Figures  3.9  and  3.10  show  such  a  comparison 
when  sixth  order  models  are  obtained  for  this  fourth  order 
system.   Ideally,  the  extra  zeros  and  poles  should  lie  at 
the  origin  in  the  z  plane  making  the  additional  transfer 
function  coefficients  zero.   Figure  3.3  shows  that  for  a 
200  point  averaging  interval,  the  lattice  locates  the  extra 
roots  in  the  vicinity  of  the  origin  while  the  brute  force 
model  does  not.   When  the  averaging  interval  is  increased 
to  4-000  points,  the  extra  roots  of  the  lattice  model  move 
in  toward  the  origin  while  those  of  the  brute  force  model 
do  not.   Instead,  a  zero  pole  cancellation  at  some  arbitrary 
location  occurs  in  the  brute  force  model.   In  all  cases 
investigated  during  this  effort,  the  lattice  method  clus- 
tered the  extra  roots  in  the  vicinity  of  the  origin  and  as 
the  averaging  interval  was  increased  to  take  in  more  data, 
these  roots  were  consistently  moved  in  closer  to  the  origin. 
This  property  is  further  evidenced  by  the  plot  of  mean  square 
equation  error  as  a  function  of  model  order  shown  in  Figure 
3.11  for  a  2  00  point  averaging  interval.   The  MSE  for  the 
lattice  model  flattens  out  at  order  four  indicating  that 
further  increases  in  model  order  fail  to  increase  its 
accuracy.   Meanwhile  the  MSE  for  the  brute  force  model  con- 
tinues to  decrease  beyond  fourth  order  as  it  uses  the  addi- 
tional roots  to  reduce  modeling  errors  caused  by  innaccuracies 
in  the  fourth  order  model. 
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Mean  square  value  of  equation  error 
(as  a  percentage  of  the  mean  square 
value  of  system  output)  vs  model  order 
for  lattice  and  brute  force  models  of 
system  A  with  200  point  averages. 
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To  investigate  the  ability  of  these  modeling  methods  to 
distinguish  roots  located  near  one  another  in  the  z  plane, 
a  pair  of  zeros  were  added  to  the  previous  system  in  close 
proximity  to  one  of  the  pole  pairs.   The  characteristics  of 
this  system  are  listed  in  Table  3.2.   Figures  3.12  and  3.13 
show  the  lattice  method  and  brute  force  modeling  results  for 
200  and  4-000  point  averaging  intervals.   With  data  over  only 
20  0  sampling  instants,  neither  method  is  able  to  accurately 
model  the  effects  of  the  adjacent  roots.   When  the  averaging 
interval  is  increased,  the  lattice  correctly  models  the 
plant  while  the  brute  force  method  does  not,  and  even  results 
in  an  unstable  model.   (Figure  3.13b  comparing  the  transfer 
function  magnitudes  has  been  plotted  in  spite  of  the  model 
instability. ) 

To  investigate  the  ability  of  these  zero  pole  modeling 
methods  to  model  systems  that  are  actually  all  zero  or  all 
pole,  the  systems  listed  in  Tables  3.3  and  3.4  were  used. 
The  modeling  results  are  shown  in  Figures  3.1M-,  3.15,  3.16 
and  3.17. 

The  conclusions  drawn  from  the  results  of  this  experi- 
mental study  are  as  follows : 

1)  For  short  data  lengths,  the  lattice  filter  method 
provides  more  accurate  models  than  does  the  brute 
force  modeling  method. 

2)  When  the  system  is  overmodeled  using  the  lattice 
method,  the  excess  roots  are -clustered  about  the 
origin  in  the  z  plane  and  as  the  averaging  interval 
is  increased  and  more  data  taken  in,  these  roots  move 
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TABLE    3.2 

SYSTEM    B 

TRANSFER    FUNCTICN  COEFFICIENTS 

NUMERATOR  DENOMINATOR 
A(CJ-      l.COCOO 

A(l)  =       1.34000  B( 1)=       1.14000 

A<2)  =       1.79940  B(2)  =    -1.45490 

A<3)=*       1.2060C  B(3)=       C. 88490 

A(4)=       0.88533  8(4)  =    -0.40745 


ROCT    LOCATIONS 


RE 

•C.7CCCC 

C7CC0C 

0.03000 

C.C3CCC 


ZEROS 

IM 

0.7000C 

-0.700CC 

0.S500C 

-0.9500C 


PCLES 


RE 
0.50000 
0.50000 
0.07000 
0.07000 


0.50COC 
-0.50000 

C.SOCCC 
-0.9000C 
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TABLE    3.3 

SYSTEM    C 
TRANSFER    FUNCTICN    COEFFICIENTS 
NUMERATOR  DENCMIMTOR 

A<CJ  =       l.COCOO 

A(l)  =    -1.50000  B(  1)=       C.C 

A(2)=       C. 62500  B(2)=       O.C 


ZEROS 
RE  IM 

0.75C00  0.2500C 

C.15CCC         -0.2500C 


ROOT    LOCATIONS 

RE 
0.0 
0.0 


FCIES 


IM 


0.0 
0.0 


TABLE    3.4 

SYSTEM    D 

TRANSFER    FUNCTICN  COEFFICIENTS 

MIMEPATOR  OENCMIMTCR 
A(C)=      l.COCOO 

A(l)=       0.0  B(l)=       1.50000 

A<2)  =       CO  B(2)=    -0.62500 


ZEROS 


RE 
0.0 

0.0 


IM 


0.0 
0.0 


ROOT    LOCATIONS 
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IM 

0.75000 
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consistently  in  toward  the  origin.   This  also  forces 
the  excess  transfer  function  coefficients  and  re- 
flection coefficients  to  be  very  small  (ideally  zero), 
clearly  indicating  that  the  model  order  is  higher 
than  necessary.   The  brute  force  method,  on  the  other 
hand,  scatters  the  excess  roots  throughout  the  z 
plane  and  produces  cancellations  of  the  excess  zeros 
and  poles.   This  results  in  nonzero  values  for  the 
excess  transfer  function  coefficients. 
3)   The  MSE  as  a  function  of  model  order  is  generally 
better  behaved  for  the  lattice  method  than  for  the 
brute  force  method,  decreasing  rapidly  until  the 
correct  model  order  is  reached  and  then  failing  to 
decrease  substantially  as  the  model  order  is  increased 
further . 
Two  qualitative  explanations  for  the  improved  performance 
of  the  lattice  filter  modeling  method  are  offered.   The 
first  is  that  the  data  is  not  windowed  in  the  lattice  method 
while  a  window  that  is  nonzero  only  over  the  span  of  the 
averaging  interval  must  be  used  in  the  brute  force  method  to 
maintain  even  symmetry  in  the  autocorrelation  function  esti- 
mates.  The  effects  of  this  difference  between  the  two  modeling 
methods  should  be  most  noticeable  for  short  data  lengths,  and 
become  less  evident  as  the  length  of  the  averaging  interval 
is  increased.   The  second  possible  cause  for  the  lattice 
method's  better  performance  is  that  the  actual  output  se- 
quences of  the  n-th  order  lattice  are  used  to  calculate  the 
coefficients  at  the  (n+l)-st  order  stage.   In  this  manner, 


modeling  errors  that  have  occurred  in  the  n-th  order  lattice 
can  be  compensated  for  to  some  extent  in  the  (n+l)-st  order 
stage.   No  similar  phenomenon  is  evident  in  the  brute  force 
modeling  approach.   Consider  the  differences  between  the 
Levinson  algorithm  (which  is  equivalent  to  the  matrix  inver- 
sion method)  of  equations  (A. 16)  and  the  lattice  method  given 
by  equations  (2.49).   Both  methods  calculate  the  corrections 
that  must  be  made  to  the  n-th  order  model  to  obtain  the 
(n+l)-st  order  model  in  terms  of  K      and  K       and 
theoretically , 


T  — 
s {e    (k)  e    (k-1)  }  =  R   (n+1)   -  P  \j 

— ^c^c  — —         — — 


When  correlations  are  estimated  by  averaging  over  finite 
intervals  however  this  equality  will  not  in  general  be 
satisfied  making  the  two  methods  different.   The  Levinson 
algorithm  will  estimate  the  correction  terms  to  be  added  to 
the  optimum  n-th  order  model  while  the  lattice  method  esti- 
mates the  correction  terms  to  be  added  to  the  estimated  n-th 
order  model  actually  obtained. 

The  improved  performance  of  the  lattice  method  is  not 
achieved  without  cost,  however .   The  method  is  made  compu- 
tationally expensive  by  the  need  to  store  the  system  input 
and  output  sequences  and  the  lattice  prediction  error  se- 
quences and  pass  them  through  successive  stages  of  the 
lattice  as  it  is  built  up  in  order.   The  computational  com- 
plexity of  the  two  methods  is  compared  in  Table  3,5. 
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Table  3,5.   Computational  requirements  for  batch 
processing  ARMA  modeling  of  an  N-th 
order  system  using  P  samples  of  the 
system  input  and  output. 


Number  of 
Correlation 
Estimates 
Required 

M-N+3   averaged  over 
P  data  points 

UN+3 

Matrix 
Inversions 

1  -  dimension  2N+1 

2N-dimension  2 
1-dimension  1 

Data  Storage 
Requirements 

N.A. 

2P  samples 

Computations 
To  Pass  Data 
Through  The 
Filter 

N.A. 

8NP  multiplica- 
tions 
4NP  additions 
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E.   ADAPTIVE  LATTICE  ARMA  MODELING 

In  addition  to  the  batch  processing  method  described  in 
the  previous  section,  the  lattice  ARMA  analysis  model  can  be 
implemented  adaptively  as  well.   The  adaptive  lattice  solu- 
tion for  the  multichannel  AR  lattice,  which  solves  most  of 
the  ARMA  modeling  problem,  has  already  been  described  in 
chapter  II.   To  make  the  lattice  ARMA  model  adaptive,  only 
an  adaptive  solution  for  the  gain  term  a(0)  need  be  added. 
To  avoid  ambiguity,  the  time  varying  adaptive  estimate  of 
this  term  at  time  k  is  denoted  by  a  (k).   Applying  an  LMS 
adaptive  algorithm  it  follows  that 


aQ(k+l)  =  aQ(k)  -  uQ  V(k)  (3.35) 


and  using  equation  (3.27d)  to  form  an  instantaneous  estimate 
of  the  gradient  yields 


V(k)  =  -2  e  (k)[e  (k)-an(k)e  (k)]  (3.36) 

u      y      0     u 


=  -2  e  (k)  e  (k) 
u     o 


so  that 


aQ(k+l)  =  aQ(k)  +  2uQ  e  (k)  eQ(k)  (3.37) 
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Here  it  is  clear  that 

1)  en^)  ^s  analagous  to  the  error  signal. 

2)  e  (k)  is  analagous  to  the  input  signal. 

3)  e  (k)  is  analagous  to  the  desired  signal, 
so  that  for  stability  un  must  satisfy 


0    <  \in    < 7-  (3.38) 

0     e{eu(kr} 


Once  again,  however ,  the  mean  square  value  of  e  (k)  may  vary 
from  stage  to  stage  and  over  time  as  the  two  channel  AR 
lattice  adapts  making  it  appropriate  to  apply  relations 
similar  to  equations  (2.58)  and  (2.59) 


^0  (k)   =  a^TkT  (3'39a) 


an  (k)  =  (1-a)  an  (k-1)  +  a  e  (k)2  (3.39b) 

0  0  u 


where  a  is  the  normalized  adaptive  step  size  and  the  depen- 
dence on  order  is  implicit  (a  superscript  (n)  could  be  used 
on  an ,  an,  V,  and  all  the  error  terms  in  equations  (3.35) 
through  (3.39)  to  explicitly  denote  their  dependence  on  the 
order  of  the  solution) , 

This  adaptive  lattice  ARMA  modeling  scheme  was  implemen- 
ted and  the  results  of  its  use  in  modeling  system  A  described 
in  Table  3.1  are  presented  here.   A  normalized  adaptive  step 
size  of  a  =  .05  was  used  in  the  following  simulations  and 
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the  results  represent  an  ensemble  average  of  one  hundred 
trials.   Unity  variance  white  noise  was  used  as  the  system 
input.   A  flow  diagram  of  the  procedure  is  shown  in  Figure 
3.18. 

Figure  3.19  shows  a  plot  of  the  mean  square  equation 
error  as  a  function  of  time  for  a  fourth  order  model  while 
it  adapts  and  Figure  3.2  0  shows  the  behavior  of  one  term  in 
the  K  matrix,  k    ,  and  one  term  in  the  K  matrix,  k,  ,  ,  at 
the  first  five  lattice  stages,  1  _<  n  <_  5 .   These  graphs  of 
the  k.. ,  terms  clearly  show  the  successive  stage  by  stage 
manner  in  which  the  lattice  model  adapts.   Figure  3.21  shows 
a  comparison  of  the  transfer  function  magnitude  and  root 
locations  of  the  system  with  those  of  the  fourth  order 
adaptive  model  after  1000  and  2000  iterations.   Figure  3.22 
shows  the  same  comparison  in  the  overmodeled  case  when  a 
sixth  order  model  is  obtained  for  this  fourth  order  system. 

While  these  results  show  that  the  adaptive  solution 
performs  well  and  is  a  viable  alternative  to  the  batch  pro- 
cessing solution,  Figures  3.22c  and  3.22d  show  that  one 
advantageous  characteristic  of  the  batch  solution  has  not 
carried  over.   The  excess  roots  in  the  overmodeled  case  are 
not  tightly  clustered  in  the  vicinity  of  the  origin  indi- 
cating that  the  excess  transfer  function  coefficients  are  not 
near  zero.   In  examining  Figure  3.2  0  it  is  also  evident  that 
the  convergence  of  the  reflection  coefficients  at  the 
overmodeled  lattice  stage  (k...   and  k, ,  )  towards  their  true 
value  of  zero  is  quite  slow.   This  relatively  slow  tracking 
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Initialize  the  parameter 
and  step  size  normalizing 
factors 

\  r 

Sample  system  input 
and  output 

v 

Find  current  value  of  e    (n) 
and  e(n)(k)  for  0  <  n  <   N 

s/ 

Update  the  step  size 
normalizing  factors 
using  equations  (2.59) 

** 

Update  the  K(n)  and  K(n) 
estimates  using  equations 
(2.55) 

and  (2.58)  for  1  _<  n  <  N 

> 

' 

Update  an    using  equations 
(3.37)  and  (3.39a)  for  n 
equal  to  the  desired  order 
ARMA  solution 

_  \ 

s 

If  desired,  calculate  ARMA 
coefficients  from  K,  K  and  a 
estimates  using  equations 
(A. 15) ,  (A. 16)  and  (3.24)  . 

•k 

/ 

Figure  3  ,  II 


Flow  diagram  of  adaptive  lattice  ARMA 
solution . 
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Figure    3.19 

Mean  square  value  of  equation  error  for  the 
fourth  order  adaptive  lattice  model  as  a 
function  of  time. 
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Figure    3.20a 

K(l,l)  term  from  the  forward  reflection 
coefficient  matrices  at  the  first  five 
stages  of  the  adaptive  lattice  model  as 
a  function  of  time. 
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Figure    3.2  0b 

K(l,l)  term  from  the  backward  reflection 
coefficient  matrices  at  the  first  five 
stages  of  the  adaptive  lattice  model  as 
a  function  of  time. 
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Figure  3.21 
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of  overmodeled,  zero  valued  parameters  has  been  found  to  be 
a  general  characteristic  of  the  adaptive  lattice  algorithm 
and  has  also  been  noted  briefly  by  Morf.  [Ref.  38] 

To  understand  the  reason  for  this  behavior,  consider 
what  occurs  as  the  overmodeled  fifth  stage  in  the  previous 
example  adapts.   Initially,  before  the  coefficients  of  the 
first  four  lattice  stages  converge  to  their  optimum  values, 
the  prediction  error  sequences  out  of  the  fourth  stage  are 
large  and  suboptimum.   These  signals  provide  incorrect  in- 
puts to  the  fifth  stage  driving  its  parameter  estimates  to 
some  values  other  than  their  optimum  zero  values.  As  the 
first  four  stages  converge  ,  the  prediction  error  sequences 
going  into  the  fifth  stage  get  small  and  since  they  drive 
the  gradient  estimates ,  convergence  back  toward  zero  is 
quite  slow  in  these  parameter  estimates. 

The  cost  functions  being  minimized  at  the  overmodeled 
fifth  stage  are  the  trace  of  P     and  P     given  by 


P<5>  =  p<*>  .  AU)  K(5)  -  K(5)TAU)T  +  k(5)V4)K(5) 


(3.40a) 


(5)    _-(4)    A(4)\(5>  _  K(5)TA(l°  +  k(5)TPU)K(5) 


(3.40b) 


Applying  the  results  of  Appendix  F,  the  parabolic  surfaces 

defined  by  these  cost  functions  are  described  by  the  eigen- 

(4)      -(4)  . 

values  and  eigenvectors  of  P     and  P    ,  the  prediction  error 

covariance  matrices  at  the  fourth  stage.   Consider  the  forward 
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predictions .   The  actual  system  output  is  given  by 


4  4 

y(k)  =  £  b(i)  yCk-i)  +  L  a(i)  u(k-i) 


(3,41) 


L=l 


i  =  0 


and  for  a  white  input  signal,  the  minimum  errors  in  the 
fourth  order  two  channel  autoregressive  predictions  of  y  and 
u  are 


e(4)  (k)  =  a(0)  u(k) 

y 


(3.42a) 


e(i4)  (k)  =  u(k) 
u 


(3.42b) 


This  results  in  an  optimal  prediction  error  covariance  matrix 
given  by 


(4) 


a(0) 


a(0) 


a(0) 


R   (0) 

uu 


(3.43) 


with  eigenvalues  of  0  and  l+a(0)  ,   Also  in  the  case  of  the 
system  described  by  Table  3,1  where  a(3)  =  a(4)  =  0,  it  is 
seen  from  equation  (3.41)  that  a  perfect  backward  two  channel 
AR  prediction  of  y(k-4)  is  possible  resulting  in  an  optimal 
backward  prediction  error  covariance  matrix  of 


p(4) 


0 


0 


e{i^4)(k)} 


(3.44) 


T  h  C 


which  also  has  a  zero  eigenvalue. 

Since  the  ellipses  obtained  by  passing  a  plane  through 


the  parabolic  cost  surface  have  axes  whose  half  lengths  given 

ue  • 
(5) 


,      ( 14  )      —  (  4 ) 

by  1/ /XT  and  since  P     and  P     each  have  one  eigenvalue  of 


zero,  it  is  seen  that  the  parabolic  bowls  along  which  K 
and  K    adapt,  degenerate  toward  infinitely  long  parabolic 
troughs  as  the  first  four  stages  converge  toward  their  opti- 
mum values.   This  is  responsible  for  the  slow  convergence  of 
the  overmodeled  parameters  back  toward  zero.   To  avoid  this 
problem,  some  means  of  detecting  this  degeneration  of  the 
cost  surface  and  then  reseting  the  appropriate  parameters 
to  zero  must  be  found  and  this  certainly  provides  an  inter- 
esting area  for  future  study. 

F.   A  LATTICE  APPROACH  FOR  MULTICHANNEL  ARMA  MODELING 

The  lattice  filter  solution  methods  for  the  ARMA  model 
can  readily  be  generalized  to  multiple  input  multiple  output 
ARMA  models  for  linear  systems.   The  equations  for  the 
multichannel  ARMA  model  of  the  system  shown  in  Figure  3.21 
are  developed  in  Appendix  G  with  the  solution  for  the  model 
coefficients  given  by  equation  (G.7)  repeated  here  for 
convenience . 


XI        Y  yr 


5  +    R  +  + 

U  Y      U  U 


B 


A 


-Yv 


I   + 

u  y 


(3,45) 


147 


u  (k) 


u   (k) 


• 

• 

• 

SYSTEM 

• 

• 

• 

■  yi(k) 


Vk) 


Figure   3.21.   A  general  multichannel  ARMA  system 

This  is  clearly  a  generalization  of  the  single  channel  ARMA 
equation  error  solution  given  by  equation  (3,20)  and  just 
as  in  the  single  channel  case  some  assumptions  are  required 
to  apply  a  Levinson  type  algorithm. 

If  it  is  assumed  that  all  the  a..(0)  are  known  or  can 
be  estimated  in  another  manner,  they  can  be  incorporated 
into  a  matrix  given  by 


*0 


a   (0)  ....  a,   (0) 

X 


aQ.l(0K---  aQ.Qn(0) 
xi  xix0 


(3.46) 


and  equation  (3.45)  becomes 


-YY   -YU 


^UY    -UU 


A 


— Yy    — Yu 


^Uy    ^Uu 


"h 


(3.47) 
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This  is  similar  in  form  to  the  equations  for  the  solution  of 
a  (Q  .  +  Qn) channel  autoregression  with  input  channels  y, (k) , 
. . . ,  yn  (k) ,  u, (k)  ,  . , . ,  un  (k)  so  that  the  multichannel 
ARMA  solution  is  related  to  the  multichannel  AR  solution  by 


B 


A 


D 


-4o 


(3.48) 


The  multichannel  AR  prediction  error  vector  is  given  by 


e(k) 


y(k) 
u(k) 


-  [YT  !    uT]  D    = 


-.  T 


e  (k) 

-y 


e  (k) 

— u 


(3.49) 


and  defining 


*  = 


-A, 


(3.50) 


it  follows  that 


<~n  ryi  rri  rc\       I  rp 

e(k)1^  =   y_(kr    -*   uCkKAg    -    [Y1    I     U1  ] 


B 
A 


(3.51) 


B    ^0(k) 
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establishing  the  generalization  of  equation  3.27c  relating 
the  multichannel  AR  prediction  error  vector  to  the  multi- 
channel ARMA  error  vector  e-.(k).   The  ARMA  prediction  error 
covariance  then  is  given  by 


Zq    =  iT  t  t  (3.52) 


and  the  coefficient  matrix  An  can  be  set  to  minimize  the 
trace  of  Pn  resulting  in  a  solution  given  by 

An  =  s{e  (k)  e  (k)T}    e {e  (k)  e  (k)T}  (3.53) 

—0      — u     — u  — u     — y 


completing  the  multichannel  generalization  of  the  single 
channel  results.   The  portion  of  the  solution  given  by  the 
multichannel  autoregression  can  be  solved  as  before  using 
the  lattice  methods  in  either  batch  or  adaptive  fashion. 
Then  the  matrix  of  gains  can  be  obtained  from  equation  (3.53) 
by  batch  processing  or  equations  (3.37)  and  (3.39)  can  be 
generalized  to  yield  an  adaptive  solution  given  by 


An(k+1)  =  An(k)  +  2  -£ —  e  (k)  en(k)T  (3.54a) 

—0         —0  2  , ,  n   —  u     —0 

aQ(k) 


where 


a?(k)  =  (1-a)  ol(k-l)    +  a  e  (k)T  e  (k)  (3,54b) 

0  0  — u      — u 
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It  is  clear  that  the  equations  and  methods  developed  earlier 
for  the  single  channel  ARMA  model  are  a  special  case  of 
these  results  with  Q.  =  Q_  =  1. 
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IV.   NONLINEAR  SYSTEM  MODELING 

The  modeling  of  nonlinear  systems  is  a  far  more  complex 
problem  than  linear  systems  modeling.   No  attempt  is  made 
here  to  provide  a  comprehensive  treatment  of  the  problem. 
Rather,  two  specific  models  for  systems  comprised  of  the 
interconnection  of  linear  and  memoryless  nonlinear  subsys- 
tems are  considered.   Both  of  these  models,  the  Volterra 
model  and  the  new  nonlinear  ARMA  model,  are  shown  to  be 
generalizations  of  the  MA  and  ARMA  modeling  problems  explored 
previously  so  that  with  appropriate  modifications,  the 
Levinson  algorithm  and  lattice  methods  can  be  used  to  solve 
for  the  model  parameters . 

A.   VOLTERRA  NONLINEAR  MODELING 

The  Volterra  series  model  characterizes  nonlinear  sys- 
tems using  a  generalization  of  convolution  where  the  system 
output  is  approximated  as  a  summation  (possibly  infinite) 
of  outputs  of  degree  m  systems. 


M 
y(k)  =  £   ym(k) 
m=l 


This  is  shown  pictorially  in  Figure  4.1. 


(4.1a) 
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Figure  4.1.   The  Volterra  model  for  nonlinear  systems 

A  number  of  representations  for  these  m-th  degree  systems 
are  possible.   The  most  commonly  used  representation  is  in 
terms  of  symmetric  Volterra  kernels  and  is  given  by 


ym<*>  ■  ?••••£ 


n    =0 
m 


n1  =  0 


a    (n-.  . 
s      1 


.n    )u(k-nn ) 
m  1 


. .u(k-n    )       (4.1b) 
m 


and  a  (n, ...n  )  is  the  m-th  degree  symmetric  Volterra  kernel 
s   1    m  &  j 

(Any  permutation  of  the  indicies  results  in  the  same  value 
for  this  kernel  giving  a  high  degree  of  symmetry.) 

This  model  arises  quite  naturally  for  a  linear  system  in 
cascade  with  a  power  series  nonlinearity  as  shown  in  Figure 
M- .  2  for  a  quadratic  nonlinearity  where  the  output  can  be 
written  as 
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00      oo 


y(k)  =   2   S    a(n1)a(n  )u(k-n1)u(k-n2) 


n  =0  n2=0 


and 


(4.2a) 


a  (n, n0 )  =  a(n, )a(n„) 
s   1  l  1     2 


(4.2b) 


u(k) 


y(k) 


Figure  4.2.   A  quadratic  nonlinear  system. 

The  Volterra  series  model  in  this  form  has  been  widely  dis- 
cussed in  the  literature  [e.g.  Ref.  1,  6,  13,  14,  25,  26,  48 
and  54]  since  Wiener  [Ref.  62]  first  applied  it  to  systems 
analysis  and  modeling.   It  has  the  added  benefit  of  treating 
linear  systems  as  a  special  case  of  the  model  since  the 
first  degree  kernel  is   exactly  the  convolutional  represen- 
tation of  the  MA  model. 

The  number  of  kernels  (M)  required  for  an  accurate  model 
depends  on  the  nature  of  the  nonlinearity  in  the  system  and  as 
long  as  the  nonlinearity  is  soft,  a  relatively  low  degree 
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model  will  suffice,  keeping  the  problem  manageable  in  that 
regard.   The  primary  difficulty  associated  with  Volterra 
nonlinear  modeling  arises  from  the  fact  that  it  uses  a 
nonrecursive  MA  representation  for  the  linear  portion  of 
the  system,  requiring  in  general,  an  infinite  memory  as 
indicated  by  the  upper  limits  on  the  summations  in  equation 
(4.1b).   In  practice,  these  summations  need  to  be  truncated 
as  shown  in  equation  (4.3) 

N      N 

y  (k)  =  Y    . . . T        a  (n, ...n  )u(k-n, ) . . . u(k-n  )   (4.3) 
■'m       *-»     *-*>n         s   1     m      1         m 
n  =0   n  =0 

1      m 

but  a  large  number  of  terms  may  still  be  required  to 
accurately  model  the  system.   One  method  of  solving  for  the 
model  is  to  set  the  parameters  (terms  of  the  Volterra  kernels) 
to  obtain  a  minimum  mean  square  equation  error  where  the 
equation  error  is  defined  as  the  difference  between  the  sys- 
tern  output  and  y(k) . 

To  simplify  the  solution  and  reduce  the  number  of  para- 
meters that  must  be  obtained,  the  symmetry  of  the  kernels  can 
be  exploited  by  rewriting  equation  (4,1b)  as 


ym<k)  =  12n      E   '"_?_     at(nx,  .  .nm)u(k-n1)  .  .  ,u(k-nm) 

(4.4) 


n  =0  n0=n,    n  =n 
1     2   1     m   m-1 


where  a^Cn, . . .n  )  is  the  m-th  degree  triangular  Volterra 
t   1    m 

kernel.   For  a  finite  upper  limit  of  N  on  the  summations 
the  m-th  degree  symmetric  kernel  will  contain  (N+l)   terms 
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but  only  —m r  of  them  are  distinct  with  the  remainder  de- 

3         N !  m! 

termined  by  symmetry  considerations. 

Still  another  representation  for  y  (k)  uses  the  regular 

r  m  ° 

form  of  the  Volterra  kernel  (this  terminology  has  recently 
been  introduced  by  Mitzel,  Clancy  and  Rugh  [Refs.  7  and  35]). 
It  is  given  by 


y  (k)  =  T    ...  T     a  (h,...h  )u(k-h1 )u(k-h, -h0) 

Jm  u   n    v.'^n   rim       1       12 

h, =0    h  =0 
1      m 


u(k-h,-. . .-h  )  (4.5) 

1      m 


where  a  (h, ...h  )  is  the  m-th  degree  regular  Volterra  kernel, 
rim  &       s 

With  infinite  upper  limits  on  the  summations,  the  symmetric, 
triangular  and  regular  forms  of  the  Volterra  kernels  are 
equivalent,  however,  when  finite  upper  limits  are  used  they 
cover  the  field  of  the  model  kernels  in  different  ways. 

Because  of  its  symmetry,  it  is  reasonable  to  have  equal 
upper  limits  on  all  the  summations  on  the  symmetric  kernel 
as  was  done  in  equation  (4.3).   This  is  shown  in  Figure  4.3a 
for  a  second  degree  kernel.   The  equivalent  triangular  kernel 
is  given  by 

N      N 

y  (k)  =  T    ...  T  a„.(n,  ,..n  )u(k-n1  )  ,  .  .u(k-n  )    (4,6a) 

Jm  *-*  <-*  t   1    m      1         m 

n,  =0    n  =n  -. 
1      m  m-1 

and  it  covers  the  region  shown  in  Figure  4.3b  for  a  second 
degree  case.   The  corresponding  regular  form  expansion,  however , 
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requires  variable  upper  limits  on  the  summations  to  cover 
the  equivalent  kernel  space  as  shown  in  equation  (4.6b),  and 
Figure  4.3c  for  a  second  degree  kernel. 

N    N-h,  N-h   , 

V  (k)  =  T       T   .-  -Y   m"   a  (h, ...h  )u(k-hn  )u(k-h1-h0) 

V  h~=0  h>0  h^o     r  1         m     1  1     2 

12m 


, ,u(k-h,-. . .-h  )  (4.6b) 

1      m 


Thus  it  is  seen  that  only  half  the  field  needs  to  be  covered 
by  the  triangular  and  regular  expansions  to  identify  the 
kernel  associated  with  the  square  field  of  the  regular  ex- 
pansion.  When  the  regular  form  expansion  is  used  with  con- 
stant finite  upper  limits  there  is  no  inherent  reason  to 
make  all  the  upper  limits  equal  since  there  is  no  symmetry 
in  the  kernel.   Considering  the  regular  expansion 

N,     N 

1     m 


y  (k)  =  y     .     J]        a  (h,  ...h  )u(k-h, ) . . .  (u(k-h, -. . .-h  ) 
Jm  ,_  *rfn      ^~n        r   1    m      1  1      m 

(4.7a) 


h  =0   h  =0 
1     m 


a  triangular  expansion  given  by 


Nn      n, +I\U   h      ,  +N 


**1      ""L,"2    'ifel      m 

y«(k)    =     /.        2-»   •  •  •    2-»  a.  (nn  .  .  .n    )u(k-n, ) , . . u(k-n    ) 

7  2  ~.  t      1  m  1  m 

n, =0    n0=n,    n    =n      , 
1  2      1m     m-1 

(4,7b) 

is  required  to  cover  the  corresponding  field.   For  a  quadratic 
expansion  this  is  shown  in  Figures  4.4c  and  4.4b,   The  equi- 
valent region  in  the  symmetric  field  is  shown  in  Figure  4.2a. 
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Therefore,  identifying  a  rectangular  kernel  in  the  regular 
kernel  space  is  equivalent  to  obtaining  a  symmetric  kernel 
in  the  arrow  shaped  region  of  Figure  4.4a. 

Which  type  of  expansion  is  more  appropriate  for  a  given 
system  depends  on  the  shape  of  the  kernels  for  that  system. 
For  example,  if  a  quadratic  nonlinear  system  has  a  kernel 
with  a  relatively  square  shape  in  the  symmetric  kernel 
space,  a  regular  form  expansion  with  constant  upper  summa- 
tion limits  will  have  to  estimate  many  zero  valued  terms 
and  is  inefficient.   On  the  other  hand  if  the  system  has  a 
kernel  similar  to  the  arrow  shaped  region  of  Figure  4.4a 
in  the  symmetric  space,  a  regular  expansion  will  be  efficient 
while  a  symmetric  expansion  over  a  larger  field  would  be 
required  with  many  zero  valued  parameters .   Not  enough  is 
known  about  how  to  relate  types  of  nonlinear  systems  with 
various  shaped  kernels  however,  and  the  question  of  kernel 
shape  and  the  best  type  of  expansion  is  not  pursued  further 
here  . 

1 .   Lattice  Filter  Methods  For  Volterra  Modeling 

Lattice  filter  methods  derived  earlier  can  be  applied 
to  obtain  a  minimum  mean  square  equation  error  solution  for 
the  Volterra  model  if  the  regular  form  of  the  expansion  is 
used.   The  Volterra  model  can  be  put  in  the  form  of  a  mul- 
tiple input  single  output  MA  model  by  defining  a  new  family 
of  signals  as  nonlinear  combinations  of  delayed  values  of 
the  system  input  u(k) . 
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un  .  .  .,  (k)=u(k)  u(k-h0)  u(k-h0-h0 ) , . .u(k-h0- . . , -h  ) 
h0    h  2        I       3  2m 

2     m 


(4.8) 


For  finite  summations  the  regular  form  of  the  expansion 
becomes 

mm       m2  I   ml 

y  (k)  =  T  ...  y    \     y       a  (h,  .  .  .h  )u,   .  .  .,  (k-h, ) 
m       u-n  u^-nlu-n         r  1  m   h0    h      1 

h  -0      ho-0(  h, -0  2     m 

m         2   11 

(4.9) 

and  can  be   regarded  as  the  sum  of  the  outputs  of  a  large 

number  of  linear  filters  whose  inputs  are  given  by  u,   . ..,  (k) 

2     m 
Equation  (M-.9)  is  exactly   the   form  of  a  Q .  input,  single 

output  MA  model  where 


Q.  =  (N  0+l)(N  a+l). ..(N   +1)  (4.10) 

xi     m2      m3         mm 

Furthermore,  if  the  same  upper  limit  on  the  summations  over 

h,  is  used  in  each  of  the  various  m-th  degree  systems , 

(N-.  ,  =N„,  =  .  .  •  =N  ,),  the  overall  Volterra  model  given  by 
11   21      ml  ° 

equations  (4.1a)  and  (4.9)   is  in  a  form  suitable  for  solu- 
tion via  the  multichannel  Levinson  algorithm  or  the  multi- 
channel MA  lattice  methods.   The  requirement  to  use  the  same 
upper  limit  on  all  summations  over  h,  arises  because  the 
Levinson  algorithm  and  lattice  methods  assume  the  same  amount 
of  memory  in  each  channel  of  the  model. 

As  a  specific  example  consider  a  second  degree  ex- 
pansion where  N1i=M2l=Nl  and  N21=N2* 
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Nl  N2   Nl 


y 


(k)  =   E   a  (h  )u(k-h  )  +   E   E    ar(hlh2)uh  (k"hi} 


h1=0  h2=0  h1=0 


(4.11a) 


u,  (k)  =  u(k)u(k-h0)  (4.11b) 

ru  2 


Defining  data  and  coefficient  vectors  for  each  channel  given 
by 


u*  (k)  =  [u,  (k)...u,  (k-N,)]T  (4.12a) 

— ru        n«       h^     1 


a*      =  [a  (0,ho)...a  (N, ,  h0)]T  (4.12b) 

— hy  r     2     r   1 '   2 


and  embedding  them  into  single  data  and  coefficient  vectors 
written  as 


T 
X  (k)  =  [u  (k)1    u  (k)1  ,   ...    u,T  (k)1]     (4.12c) 
—         -      I  -°      I      ,  "N2 


d      =  [a   I  aQ   |  . ..  |  aN   ]  (4.12d) 

1       i      i    2 


the  equation  for  the  Volterra  model  output  becomes 

A         V+     T   + 

y(k)  =   A  (k)   d  (4.12e) 
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which  is  clearly  of  the  same  form  as  equation  (A. 24a)  and 
represents  a  N  +2  input,  single  output  MA  model.   All  the 
nonlinearities  in  the  Volterra  model  are  external  to  this 
MA  model,  in  the  formation  of  its  various  input  signals  from 
the  system  input  u(k) .   This  model  is  illustrated  in  Figure 
4.5. 

It  is  interesting  to  consider  what  the  recursive  in 
order  nature  of  the  Levinson  algorithm  or  lattice  methods 
mean  to  the  nonlinear  Volterra  problem.   In  building  up  the 
MA  model  solution  recursively  in  order,  the  upper  limit  of 
the  summations  over  h,  is  increased  until  the  desired  value 
is  reached.   In  terms  of  the  Volterra  model  kernels,  this 
means  allowing  each  of  the  regular  form  kernels  to  grow  in 
size  in  the  h,  dimension  while  holding  their  boundaries  fixed 
at  prespecified  values  in  all  the  other  dimensions.   In  the 
linear  MA  model,  the  kernel  has  only  one  dimension  and  there- 
fore the  recursive  in  order  solution  eliminates  any  require- 
ment to  prespecify  its  upper  limit  (the  order  of  the  model). 
For  the  higher  degree  nonlinear  kernels,  these-  methods  reduce 
by  one  the  number  of  kernel  boundaries  that  must  be  pre- 
specified for  each  kernel.   Allowing  the  kernel  to  grow  in 
any  of  the  other  M-l  dimensions  (h~  through  h  )  corresponds 
to  adding  additional  channels  to  an  existing  lattice  and 
simple  methods  of  accomplishing  this  are  not  available. 
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u(k) 


4- 


(2) — *  y(k) 


Figure  4.5.   A  second  degree  nonlinear  Volterra 


model  in  multichannel  MA  form 
The  a^   represents  are  coeffi- 
cients of  the  single  input  sinsl 


output  MA  models  shown. 


gle 
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B.   NONLINEAR  ARMA  MODELING 

As  was  previously  mentioned,  the  primary  difficulty 
associated  with  the  Volterra  series  model  arises  from  the 
fact  that  it  is  a  nonlinear  generalization  of  the  MA  model 
and  as  such,  a  large  number  of  terms  may  be  required  to 
accurately  represent  even  a  mildly  nonlinear  system.   In 
linear  system  modeling  this  difficulty  was  remedied  by 
using  the  more  general  ARMA  model.   It  is  reasonable  to 
assume  therefore  that  a  nonlinear  generalization  of  the 
ARMA  model  could  remedy  the  problem  in  the  nonlinear 
modeling  case  (for  at  least  certain  types  of  nonlinear 
systems).   Such  a  generalization  called  the  nonlinear  ARMA 
model  has  recently  been  proposed.  [Ref .  64]   This  model 
forms  an  estimate  of  the  current  value  of  the  system  output 
as  follows: 


OO  CO 


(k)    =     T]     a    (n,  )u(k-n, )    +    lL      -^      a    (n, n0 ) u(k-n, )u(k-n0 ) 

n  Si  1  „  _«  S  1      2  1  I 


n    =0      °      x  "-        n  =0  n='( 


...+    T]   ...    J2  a    (n. . . n_)u(k-n, ) . . .u(k-n    ) 

n  -n         S         1  P  1  p 

n, =0       n    =0  r 

1  P 


CO  CO 


♦  E 

m 


i       b    (m, )y(k-m    )+     J]       L      b    (m      m? )y (k-m, )y (k-m? ) 
=  1  m-,  -  i  m_  =  1 


1  2 


CO  CO 


. . .+    z2  ,  .  /^       b    (m,  ,  . .m    )y(k-m,  ) . . ,y(k- 
ttSI  q  1 


m    ) 


m,  =1  m    =1 
1        q 
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CO  CO 


+    Yi    £       C(n1m1  )u(k-n,  )y(k-jn,  )  +  .  .  .+    £...£...£... 

nL  =  0m1  =  l  n1=0  n  =0  m^l 


CO 

2^     C(n, ...n   m, . . . m    )u(k-n, ) . .u(k-n    )y(k-m,) 
tmt1  1  p   1  q  1  p^l 


m 

q 


. . .y(k-M    )  (4.13) 

y        q 


The  first  three  terms  of  equation  (4.13)  are  a  discrete 
Volterra  expansion  of  the  input  signal  u(k)  and  represent 
F, nCu(k)]  in  terms  of  the  discussion  of  Chapter  I.   The 
second  three  terms  are  a  discrete  Volterra  expansion  on  the 
system  output  delayed  one  sample  interval  and  represent 
F-Q[y(k-1)3.   The  final  two  terms  are  bivariate  expansions 
of  the  system  input  and  delayed  output  and  represent 
F  n[u(k),  y(k-l)].   (This  is  the  first  and  only  model  con- 
sidered where  F^C*]  is  not  assumed  to  be  zero.)   Equation 
(4.13)  is  clearly  a  nonlinear  extension  of  the  linear  ARMA 
model  contained  in  the  first  and  fourth  terms.   As  was  the 
case  in  the  Volterra  model) the  number  of  multiple  summations 
required  in  (4.13)  is  dependent  upon  the  nature  of  the  non- 
linearity  in  the  system  being  modeled.   The  upper  limits  on 
the  summations  in  (4.13)  have  been  written  as  infinity  in- 
dicating a  requirement  for  infinite  memory  or  model  order. 
As  is  discussed  subsequently,  however ,  the  required  model 
order  (memory)  may  in  fact  be  finite  due  to  the  nature  of 
the  system  being  modeled,  thereby  alleviating  the  difficulty 
encountered  in  the  Volterra  nonlinear  model  previously  presented 


The  kernels  of  the  input  expansion  and  output  expansion 

a  (•)  and  b  (*)  are  symmetric  since  any  permutation  of  the 
s  s 

indicies  results  in  the  same  value  for  the  kernel.  They 
have  therefore  been  labeled  with  a  subscript  "s"  and  can 
also  be  written  in  triangular  and  regular  form. 

oo  oo 


Y\...  ./]      a    (n,  ,  .  .n    )u(k-n,  )  .  .  .u(k-n    ) 
_n    „     -n       Sip  1  p 


nn  =0    n    =0 

1  P 


2j     .  .  .  2]  a.  (n- . .  .n    )u(k-n,  ).  .  .u(k-n    ) 

_  _  _    _  _  t      1  p  1  p 


n    =0      n0  =  n,  n    =n 

1  2      1  p      p-1 


Y,        a    (h, ...h    )u(k-h    )u(k-h   -h9) 


h    =0  h    =0 

1  P 


r^^1-  •  .»    ,-*~    i^,-^    ^    u2 


u(k-h,-. . .h    )  (4.14) 

1  P 


ml 


2       •  ••       2      b    (ni,  ...m    )  y(k-m,  )  .  .  .  y  (k-m    ) 
=  1  m    =1      S      l  q  L  q 


2  2        ...  2  b    (m1...m    )y(k-l-m1).  ..(k-l-m    ) 

m    =0      nu=m-.  m    =m      ,  q  4 

1  2      1  q      q-1 


2       •••       2      b^(h1 ...h    )y(k-l-h    )y(k-l-h    -h    ) 
h.=0  h    =0      r      x  q  L  x 

1  q 


.  .  .yCk-l-h-,-.  .  .-h    )  (4.15) 

1  q 
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In  writing  the  triangular  and  regular  forms  in  equation  (4.15) 
the  lower  index  on  the  summations  has  been  shifted  to  zero. 

In  the  case  of  the  bivariate  expansion  terms  in  equation 
(4.13)  the  kernels  do  not  possess  symmetry  so  that  trian- 
gular expansions  are  not  possible  but  regular  form  expansions 
are  possible. 


2   •••   2     2   •••   2   C(n  . . .n  m  . . .m  )u(k-n  ) 
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1        pi        q 
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00 
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1   1     p+q       1  1       p 


yCk-l-h,-. . .-h  ., ) . . .yCk-l-h,-. . .-h  .  > 
J  i     p+i   J  i     p+q) 


2   cr2(hr  •  •hp+q)y(^-l-h1)  •  •  .y(k-l-hf...-h) 


h  =0     n  .  -o  ^ 

1      p+q 


u(k-h,-...-h  , , ) . . .u(k-h, - . . . -h    ) 
1       q  +  1  I       p+q 


(4.16) 


Thus  two  regular  forms  are  possible  for  the  bivariate  expan- 
sions. Figures  4.6  and  4.7  illustrate  the  manner  in  which 
the  regular  form  of  the  bivariate  expansion  covers  the  field 
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of  model  kernels  for  a  second  degree  case  for  finite  upper 
limits  of  N,  and  N„  on  the  summations .   It  is  interesting 
to  note  that  because  of  a  lack  of  symmetry  in  the  original 
form  of  the  bivariate  expansion,  the  causal  region  in  the 
regular  form  extends  outside  the  quarter  plane. 

As  was  the  case  with  the  Volterra  model,  it  will  be 
shown  that  in  regular  form,  a  minimum  mean  square  equation 
error  solution  can  be  obtained  for  the  nonlinear  ARMA  model 
coefficients  using  either  the  Levinson  algorithm  or  the 
lattice  methods.   This  will  provide  the  nonlinear  generali- 
zation of  the  results  presented  in  Chapter  III  on  linear 
ARMA  modeling.   Before  developing  this  method  of  solution, 
however,  the  applicability  of  the  nonlinear  ARMA  model  to 
various  types  of  systems  and  its  memory  requirements  will 
be  considered. 

1 .   Identif iability  Conditions  and  Memory  Requirements 
In  the  previous  chapter  on  linear  ARMA  modeling  it 
was  stated  that  there  were  two  facets  to  the  question  of 
model  identif iability ;    input  signal  requirements  and  mea- 
surement requirements.   The  question  of  measurement  re- 
quirements was  not  discussed,  however ,  since  it  was  assumed 
that  all  signals  (input  and  output)  were  observed.   In  the 
study  of  systems  comprised  of  interconnected  linear  and 
nonlinear  subsystems,  however ,  various  internal  signals  exist 
and  the  effect  of  either  observing  or  not  observing  them  on 
the  modeling  process  must  be  explored.   To  do  so,  it  will 
be  assumed  that  the  system  under  study  can  be  put  into  the 
form  of  Figure  4.8  fulfilling  the  following  equations. 
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x-Ck)  =  l±   YN(k)  +  C^dc)  (4.17a) 


^Ck)  =  r2  yL(k)  +  c2u(k)  (4.i7b) 


yM(k)  =  F  [x  (k)]  (4.17c) 

^■N      —  — n 


yL(k)  =  T  [x  (x)]  (4.17d) 


where 


xL(k)  =  [xL1(k)    ...    xLp(k)]T  (4.18a) 

a  vector  of  inputs  to  P  linear  subsystems; 

xN(k)  =  CxN1(k)    ...    x   (k)]T  (4.18b) 

a  vector  of  inputs  to  Q  nonlinear  memoryless  subsystems; 

yL(k)  =  CyL1(k)    ...    yLp(k)]T  (4.18c) 

a  vector  of  outputs  from  P  linear  subsystems; 


y^(k)  =  CyN1(k)    ..,   yNQCk)3  (4.i8d) 

a  vector  of  outputs  from  Q  nonlinear  memoryless 
subsystems , 
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u(k) 


y, 

■f 

xL(k) 

«•] 

yL(k) 

fxN(k) 

P    Linear    Sys- 
tems 

•r, 
■fi 

-• 

■y 

\ 

c2 

F[«] 

\  %<k) 

Q    Nonlinear 
Systems 

Figure  4.8.   A  General  Nonlinear  System 
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The  z  -  transforms  of  these  signals  are  also  defined  as 
XL(z),  X  (z),  Yt(z),  Xn(z).   £-,  J_2>  £1  and  C2  are  matrices 
whose  elements  are  either  0,  -1  or  +1  indicating  the 
interconnections  of  the  various  subsystems.   T[  .  ]  and  F[ . ] 
are  diagonal  matrices   specifing  the  linear  and  nonlinear 
subsystems  as  follows. 


YT  (z)    =    T(z)    XT  (z) 


(4.19a) 


where 


T(z)    =    [T. .(z)] 


A.(z) 

l 

l-B.(z) 

l 


i  =  ] 


i*j 


(4.19b) 


and 


A. ( z)    =       y\     a. (n)    z 

n=  0      X 


-n 


(4.19c) 


B.  (z)    =        5*     b. (n)    z 

n=l 


-n 


(4.19d) 


yN<k>  =  I^xN(k)]  = 


fx[.] 


F2C3 


F   [.] 


X(k) 

xM;(k) 
N2 


XNQ(k) 


Fl[xN1(k)] 


FQCxNQ(k)] 


(4.20) 


This  overall  system  given  by  Figure  4.8  is  adequate  to  re- 
present a  broad  class  of  systems  comprised  of  intercon- 
nections of  linear  and  nonlinear  subsystems  including 
cascades,  parallel  connections  and  feedback  systems. 
Equations  (4.19)  and  (4.20)  assume  all  the  subsystems 
are  single  input  single  output  noninteracting  systems.   If 
desired,  the  collection  of  linear  subsystems  given  by 
T(z)  can  readily  be  put  into  the  general  multichannel  ARMA 
form  of  Section  III.F  to  allow   each  output  to  be  a  function 
of  past  values  of  all  outputs ,  and  past  and  present  values 
of  all  inputs. 

Alternate  representations  of  these  linear  and  non- 
linear subsystems  are  also  useful.   Equations  (4.19)  are 
equivalent  to 


YT(z)  =  CA.(z)]XT(z)  +  [B.(z)]YT (z)  (4.21a) 

— Li  1        — Li  1  1-J 

or  in  the  time  domain 

yL(k)  =  [a.(k)]*xL(k)  +  Cb.  (k)  ]*y_L(k)  (4.21b) 

Here  *  represents  convolution  and  is  carried  out  in  the 

same  sense  as  matrix  multiplication  and  the  matrices- 

Ca.(k)],  Cb.(k)],  CA.(z)]  and  [B.(z)]  are  diagonal  matrices 

whose  i-th  entries  are  the  time  domain  functions  a.(k)  or 

b.(k)  or  the  polynomials  A.(z)  or  B.(z).   The  time  domain 
X  v       3  1  l 

functions  a.(k)  and  b.(k)  are  the  inverse  transforms  of  the 

i         i 

polynomials  A.(z)  and  B.(z).   This  can  also  be  represented 
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in  nonrecursive  form  as 


yT (k)  =  [h.(k)]*xT (x) 


(4. 22a) 


where  [h.(k)]  is  a  diagonal  matrix  of  impulse  responses 
defined  by 


h.(k)  = 

i 


n=0 


h. (n)  5  (k-n) 

i 


(4.22b) 


The  nonlinear  systems  can  also  be  represented  in  terms  of 
inverse  functions  assuming  they  exist  over  the  necessary 
ranges  of  the  variables  so  that 


*N<k>  =  r1C^JCk)]  = 


F~x[-] 


0 


*F"1['  ] 
rQ  L  J 


^Nl(k) 


yNo(k) 


Fl1[^Nl(k)] 


FQ1[yNQ(k)] 


(4.23) 


So  that  equations  (4.17)  are  iterative  it  is 
necessary  that  no  delay  free  loops  exist  in  the  system  of 
Figure  4.8,   A  necessary  and  sufficient  condition  for  the 
absence  of  delay  free  loops  is  developed  in  Appendix  H  and 
requires  that  the  terms  of  the  determinant  of  the  matrix 
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a  =  Lim   £2  T(z)  l± 


(4,24) 


z^-°° 


and  all  its  principal  minors  must  be  zero. 

The  various  signal  vectors  and  equations  (4.17)  can 
now  be  partitioned  as 


x'(k) 


x;'(k) 

— Li 


x'Ck) 

— N 


*!}<*> 


"^(kf 


y^k) 


Ila 


lie 


T-2c 


-lb 


U.A 


r  r 

-2a        -2b 


-2d 


#(W 


XS<*> 


v-(k) 


y£(k) 


—la 


=ib 


!aC-]        ^[.71 


F,[.]         F,[.] 

— c  — d 


^a 

+ 

=  2b 



*N<*5" 


2S(K) 


u(k) 


u(k) 


(4.25a) 


(4.25b) 


(4.25c) 


i£(kr 


y£(k) 


"T    [.]        T-C.J1 
—a  — b 


T    [.]         T\[.] 
— c  — d 


x'Ckf 


x£(k) 


(4.25d) 


Equation  (4.25c)  can  also  be  written  in  terms  of  inverse 
functions  as  in  equation  (4.23),  and  (4.25d)  can  be  written 
using  either  recursive  or  nonrecursive  representations  of 
the  linear  systems  as  was  done  in  (4,21)  and  (4,22).   The 
single  primed  signal  vectors  in  equations  (4.25)  represent 
those  signals  which  are  observed  and  the  double  primed  signals 
are  those  which  are  not.   It  is  assumed  that  all  input  signals 
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in  u(k)  are  observed.   It  is  possible  to  rewrite  equations 
(4.25)  as  a  single  composite  matrix  equation  as  done  in 
(4. 26a).  As  written,  equation  (4.26a)  is  an  infinite  memory  or 
model  order  representation  because  of  the  presence  of  the 
h(k)*  operators  but  a  finite  memory  version  can  be  written 
by  expressing  rows  4  and  8  as 


y/(k)  =  a  Ck)*x'  (k)  +  b  (k)*y»  (k)  +  aK(k)*x"(k) 

— L      —a    — L      —a    —  L      — b     — L 


+  b,  (k)*y"(k)  (4.26b) 

— b    — L 


y!'(k)  =  a  (k)*xT'(k)  +  b  (k)yT' (k)  +  a,(k)*x"(k) 

— L       — c     — L       — c    — L       — d     — L 


+  b^(k)*y£(k)  (4.26c) 


The  third  and  seventh  rows  can  also  be  written  in  terms  of 
inverse  functions  if  desired  as 


x^(k)    =    F^Cy^k)]    +    Fj^CyjJCk)]  (4.25d) 


x"(k)    =    F~1[y'(k)]    +    F_:1[y;'T(k)]  (4.26e) 

— N  — c      JN  — d      — N 


Now  the  problem  of  writing  a  system  of  iterative  equations 
for  the  observed  signals  in  terms  of  only  observed  signals 
consists  of  rewriting  equation  (4.26a)  so  that  the  upper 
right  4  by  5  partition  is  a  null  matrix.   If  this  can  be 
done,  the  general  form  of  the  nonlinear  ARMA  model  can  be 
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used  to  identify  the  composite  effects  of  the  operators 
appearing  in  the  upper  left  5  by  5  partition.   In  some  cases 
this  will  only  be  possible  of  the  infinite  memory  version  of 
the  model  (with  h(k):':  operators)  is  used  and  in  other  cases 
a  finite  memory  model  will  suffice, 

The  process  of  determining  whether  infinite  or  finite 
memory  nonlinear  ARMA  models  can  be  used  to  identify  a  given 
system  is  illustrated  for  two  examples  in  Appendix  I.   First 
a  system  consisting  of  a  cascade  of  linear  and  nonlinear 
systems  is  considered.   Then  a  model  of  the  tracking  behavior 
of  a  phase  locked  loop  is  put  in  the  form  of  a  nonlinear  ARMA 
representation . 

2 .   Lattice  Filter  Methods  for  the  Nonlinear  ARMA  'Model 

As  was  the  case  with  Volterra  modeling,  lattice 
filter  methods  can  be  applied  to  the  nonlinear  ARMA  modeling 
problem  if  the  regular  form  of  the  kernels  is  used.   A 
family  of  signals  is  defined  as  nonlinear  combinations  of 
delayed  values  of  y(k)  and  u(k)  as  follows. 


u,     v,  (k)  =  u(k)  u(k-h0)u(k-h0-h„).  .  .u(k-h0-.  .  .-h  ) 
h0 . . . n  l  l      3         l  m 

2     m 


(4.27a) 


yh    h  (k)  =  y(k-l)y(k-l-h2)y(k-l-h2-h3) 
2 "    m 


. , .y(k-l-h0-. . .-h  )  (4.27b) 

2       m 
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ll     .   y.       .    (k)  =  u(k)  u(k-h0)  .  .  .u(k-h0-.  .  . -h  ) 
h0...h  J  h  ,-.  .  .  .n  .  2         2      p 

2     p    p+1     p+q  ^ 

y(k-l-h2-. . --hp+1) 

. . .y(k-l-h,-. . .-h    )   (4.27c) 

y  i     p+q 

or  alternately 


yh2...hq  Uhq+1...hq+p  =  y(k~l)y(k-l-h2)...y(k-l-h2-...-h  ) 


u(k-h0- . . . -h  ,.,)..  .u(k-h0- ... -h  .  >, 
I  q+1  2       q+p) 


(4.27d) 

With  finite  summations,  the  regular  form  of  equation  (4.14) 
becomes 

N  N      „    [    N      , 

P>P  P>2    I       p,l 

T        •••         V  T  a   (h,  ...h   )   u,  .     (k-h, ) 

£-t  Zj      \       ju  rip        h9...h  1 

h   =0  h   =0        h    =0  l  p 

P  2         (       1 

(4.28a) 

Equation  (4,15)  becomes 


m  m    0      in    -, 

q>q  q32  q»i 

2     •••      2  2      Vhr'V  yh9...h  (k-hi} 

h    =0  ho  =  0        /  h,=0  q 

q  2  i 

(4.28b) 
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and    equations    (4.16)    become 


p+q, p+q 

2 


Jp  +  q52 

2 

h2  =  0 


L 


p+q,l 


h1  =  0 


C    ,  (h,  .  .  .  h         )    u,  , 

rl      1  p+q        h0. . .h 


or 


Jh    ,-,  .  ,  .h    ,  1 

p+1  p+q 


(4.28c) 


q+p,q+p 

2 

Vp=0 


Jq+p,2 

2 

h2  =  0 


L 


q+p,l 


h1  =  0 


C    0(h-. 
r2      1 


u,  .         (k-h,  ) 

h    , , . . .h    ,  1 

q+1  q+p 


.h    ,     )    y, 

a  +  p      J] 


.  .h 


q 


(4.28d) 


These  terms  can  be  viewed  as  the  summation  of  the  outputs 
of  a  large  number  of  linear  filters  whose  inputs  are  the 
signals  defined  in  equations  (4.27).   In  the  context  of 
multichannel  filtering,  each  of  these  filters  can  be  con- 
sidered as  a  single  channel  and  each  of  the  input  signals 
can  be  associated  with  one  of  the  channels.   Since  the 
lattice  models  use  the  same  amount  of  memory  in  each  channel, 
the  upper  limits  on  all  summations  over  h..  will  be  made  the 

same  (N   .  =  N   ,  =  L     ,  =  N-.  )  .   The  upper  limits  on  the 
p,l     q,l     p+q,l     1         *v 

summations  over  the  other  indicies  determine  the  number  of 

channels  required. 
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For  a  quadratic  nonlinear  system,  the  present  value 

of  the  system  output  is  estimated  as 

Nl  N22  Nl 

y(k)      Y      a  (h-,)u(k-h,)  +    V  V   a  (hn,h0)u^  (k-h, ) 

J  *-*        r  1      1      *-*  **       r  1  2  h9    1 

h1=0  h2=0  h1=0  l 

Nl  M22  Nl 

+   X   br(h1)y(k-h1)  +   2  2   br(hl'h2)yh  (k"h^ } 

h,=l  h2=0  h1=0  2 


L22     Nx 

E     E   Cr  (hlSh2)  u  yh  (k-hx)       (4.29) 
h2=0    h1=0    X  2 


where  uh  (k)  =  u(k)u(k-h2)  ,  yh  (k)  =  y (k-1 )y (k-l-h2 ) 

and  u  y   (k)  =  u(k) y(k-l-h2 ) .   Signal  and  coefficient  vectors 
can  be  defined  for  the  various  quantities  in  equation  (4.29) 
following  the  conventions  established  earlier;  for  example 


y(k)  =  Cy(k-l)  ...  y(k-N,)]  (4.30a) 

+  T 

yh  (k)  =  [y   (k)  ...  yh  (k-^)]  (4.30b) 

and 

T 

b  =  [b  (1)  . . .  b  (N, )]  (4.30c) 

—     r         r   1 

+  T 

b,   =  [b  (0,k  )  ...  b  (N,,h0)]  (4.30d) 

— h  „      r     2        r   1   2 
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Embedding  all  these  vectors  into  single  data  and  coefficient 
vectors,  equation  (4.29)  becomes 


y(k)  =  _Xx(k)   d 


(4.31a) 


where 


X^Ck) 


T      +     T 

Cy(k)1  1  u  (k)1 


yn(k)x  , 


+      T 
22 


u0(k)X  | 

i 


,   w22 


and 


ii 


=  [b' 


+    T 


tT  i 


^0   ! 


T 


^0   I 


— M 


^N 


+ 


I  uyT   (k)1] 
— L2  2 


T 
22 

T 
22 

T 
22 


(4.31b) 


(4.31c) 


The  minimum  mean  square  equation  error  solution  for  the 
model  coefficient  vector  d,  is  given  by 


e{  X1(k)  Xx(k)T}  dx  =  eCX^Ck)  y(k)} 


(4.32) 


where  the  equation  error  is  defined  as 


e(k)  =  y(k)  -  y(k) 
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While  equation  (4.32)  is  similar  in  form  to  equations 
(A. 7a)  and  (A. 26a)  for  AR  and  MA  models,  it  lacks  the  nec- 
essary form  for  the  application  of  the  Levinson  algorithm 
because  of  the  fact  that  some  component  data  and  coefficient 
vectors  are  N, -vectors  while  others  are  (N  +1) -vectors. 
This  is  the  same  problem  that  arose  in  linear  ARMA  modeling 
and  can  be  handled  here  in  a  similar  manner.  Defining 


if;    =    Cl  -a    (0)  -    b    (0,0)  ...  -b    (0,Moo) 

—  r  r  r  22 


a    (0,0)  ...  -a    (0,Noo) 

r  r  22 


c    . (0,0)  ...  -c    , (0,Loo)] 

rl  rl  22 


(4. 33a) 


and 


x(k)    =    Cy(k)  u(k)  Yn(k)  •••  Vy\      (k) 

u  ii22 


uQ(k)  ...  u„      (k) 


uyQ(k)         . . .  uy.       (k) ] 


22 

T 
(k)] 
22 

(4.33b) 


and  assuming  that  the  coefficients  in  i£  can  be  estimated  in 
some  other  manner,  the  MMSE  solution  for  the  remaining 
model  coefficients  becomes 
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£(X  (k)  X(^)  >  d  =  £{_X<k>  x(k)T}  ty  (4.34) 

Here  X(k)  and  d  are  defined  as  X,  (k)  and  d..  with  all 
superscripts "+"  removed  from  their  component  vectors  (indi- 
cating they  are  all  indexed  from  1  to  N..  and  are  all  N  - 
vectors).   Note  that  just  as  in  the  linear  AR'MA  case,  the 
coefficients  in  ty_   essentially  correspond  to  gains  on  their 
respective  channels.   Comparing  equation  (4.34)  with  (A.  7a) 
for  a  multichannel  autoregression  it  is  clear  that  d  can  be 
obtained  from  the  multichannel  AR  solution  by 

d  =  _D  ±  (4.35) 

where  the  signals  in  x(k)  comprise  the  channels  in  the  auto- 
regression.  Therefore,  either  the  Levinson  algorithm  or  the 
lattice  methods  can  be  used  to  solve  for  Jj  and  from  it  and 
a  knowledge  of  t£_,  the  nonlinear  ARMA  model  coefficients  can 
be  obtained. 

By  analogy  with  equation  (3.27)  it  also  follows  that 
the  nonlinear  ARMA  equation  error  is  related  to  the  AR  pre- 
diction error  vector  by 


so  that 


e(k)  =  \pT    e(k)  (4.36a) 


e{e(k)2}  -    i\)T   ?   ip  (4.36b) 
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where  P  is  the  AR  prediction  error  covariance  matrix.   The 
coefficients  in  \\i   can  therefore  be  set  to  minimize  the  mean 
square  value  of  the  nonlinear  ARMA  equation  error  in  (4.36b) 
resulting  in  a  solution  given  by 
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NN 


a  (0) 
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c  (0,L   ) 
r     22 
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;N1 


(4.37a) 


where  N  is  the  total  number  of  channels  in  the  model 


N  =  1  +  1  +  (M22+l)  +  (N  2+l)  +  (L22+l) 


and  the  p..  are  the  elements  of  the  prediction  error  co- 
variance  matrix.   It  is  readily  apparent  that  the  linear 
ARMA  model  and  its  solution  via  the  Levinson  algorithm  or 
lattice  methods  are  a  special  case  of  this  formulation  of 
the  nonlinear  ARMA  model  just  as  one  would  expect. 
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V.   APPLICATIONS,  CONCLUSIONS,  AND  OPEN  QUESTIONS 

In  the  previous  four  chapters,  existing  methods  for  AR 
and  MA  modeling  were  reviewed  and  from  them  new  methods  for 
linear  and  nonlinear  ARMA  modeling  were  developed.   Here, 
two  applications  for  these  new  methods  in  reduced  order 
modeling  and  modeling  for  fault  detection  and  diagnosis 
are  examined  briefly.   Then  the  results  of  this  research 
are  summarized,  conclusions  drawn  and  significant  open 
questions  for  the  continuation  and  extension  of  this  work 
are  listed. 

A.   APPLICATIONS 

1.   Reduced  Order  Modeling 

Oftentimes,   complex  physical  systems,  both  linear 
and  nonlinear,  can  be  approximated  quite  closely  using 
simple  models.   The  lattice  solution  methods  developed  here 
provide  a  very  natural  and  efficient  means  of  determining 
reduced  order  models  for  complex,  high  order  systems  especially 
in  the  case  of  linear  systems.   In  Chapter  III  it  was  argued 
that  for  linear  systems  it  is  reasonable  to  build  up  ARMA 
models  by  simultaneously  incrementing  the  order  of  the  numera- 
tor and  denominator  polynomials  as  the  lattice  method  does . 
When  this  method  is  used  to  build  up  a  given  order  model, 
all  lower  order  models  and  their  mean  square  values  of  equation 
error  are  readily  obtained  as  well  (the  only  additional 


calculations  needed  are  for  the  MSE  and  the  gain  term  or 
gain  matrix  in  the  multichannel  case)  making  it  easy  to 
compare  the  various  models  and  decide  if  reduced  order  models 
provide  sufficient  accuracy. 

Consider  for  example  the  seventh  order  system  whose 
characteristics  are  listed  in  Table  5.1.   The  magnitude 
spectra  of  second,  third,  sixth  and  seventh  order  lattice 
models  obtained  using  batch  processing  with  4000  point 
averages  are  compared  to  that  of  the  system  in  Figure  5.1. 
It  is  apparent  that  a  second  order  model  is  unable  to 
approximate  the  system  well,  however  a  third  order  model  does 
provide  a  good  approximation.   Furthermore,  increasing  the 
model  order  to  four,  five  and  six  fails  to  significantly 
improve  its  performance  as  evidenced  by  the  sixth  order 
plot  in  Figure  5.1c.   The  model  accuracy  is  not  significantly 
increased  until  seventh  order  (which  corresponds  to  the  order 
of  the  actual  system)  when  a  very  good  fit  is  achieved.   This 
is  further  illustrated  by  the  plot  of  the  normalized  mean 
square  equation   error  as  a  function  of  model  order  shown  in 
Figure  5.2.   The  cost  drops  rapidly  going  from  second  to 
third  order  but  then  fails  to  decrease  significantly  until 
seventh  order  is  reached.   The  roots  of  the  system  and  the 
various  order  models  are  also  plotted  in  Figure  5.3. 

The  benefits  of  the  lattice  method  for  reduced  order 
nonlinear  ARMA  modeling  are  not  quite  so  pronounced  however, 
since  adding  extra  stages  to  the  lattice  corresponds  to 
allowing  the  kernels  to  grow  in  only  one  of  their  multiple 
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TABLE    5.1 

SYSTEN    E 

TRANSFER    FUNCTICN  COEFFICIENTS 

NUMERATOR  DENOMINATOR 
A(C)  =      1. CO  000 

A(l)  =    -2.16510  8(1)=       2.18950 

A(2)  =       1.5625C  B(2)=    -2.21260 

M2)  =       0.0  B(3>  =       1.51740 

4 <4)  =       0.0  E(4)=    -C. 85350 

A(5J=       0.0  B(5)=       C.45144 

AU)  =       CO  8(6)=    -C  23462 

M7)=      0.0  B(7)=       C.C9331 


ROCT  LOCATIONS 


ZEROS 


RE 
1.C625C 
1.C6250 

c.c 
c.c 

CO 

c.c 
c.c 


IM 
0.6250C 
-C. 62500 
0.0 
0.0 
0.0 
CO 
O.C 


PCLES 


RE 
0.9C000 
0.69282 
0.69282 
0.23941 
0.23941 
-0.28750 
-0.28750 


IM 

0.0 

C. 40000 
-0.4000C 

0.65778 
-0.65778 

0.49796 
-0.49796 
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Figure    5.1 


191 


(c) 
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(d)    -10 
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Figure  5.1  con't 


192 


E^n)xlOO 

yy 


.5 

.4 

- 

.3 

- 

.2 

- 

^ 

.1 

■ 

i 

i 

1              I 

i 

i 

(7) 

(75 

• 

l 

i 

0 

1 

2 

3          4 

5 

6 

7 

8 

9 

n 


Figure  5 . 2 


Mean  square  value  of  equation  error  (as 
a  percentage  of  the  mean  square  value  of 
system  output)  vs ,  model  order  for  lattice 
models  of  system  E  obtained  using  batch 
processing  and  4000  point  averages. 
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dimensions.   (The  linear  "kernel"  had  only  one  dimension,) 
Here  some  a  priori  information  would  be  useful  in  determining 
the  values  to  prespecify  for  the  boundaries  of  the  kernels 
in  the  other  dimensions.   Still  however,  when  compared  to  a 
brute  force  matrix  inversion  approach  where  all  kernel 
boundaries  must  be  prespecif ied ,  the  lattice  method  provides 
a  viable  alternative  in  obtaining  reduced  order  models 
especially  for  low  degree  systems. 

2 .   Modeling  For  Fault  Detection 

The  problem  of  fault  detection  and  possible  diagnosis 
can  be  formulated  as  follows. 

a.  Obtain  a  parameterization  that  describes  the 
current  functioning  of  the  system  under  test. 

b.  From  this  parameterization,  determine  if  the 
system  is  functioning  normally  or  if  a  fault 

has  occurred  by  comparison  with  a  fault  dictionary 
It  is  the  first  part  of  this  problem  that  has  been  addressed 
in  this  work.   The  parameterization  can  be  as  simple  as 
sampled  measurements  of  the  response  to  specific  inputs 
however,  the  large  volume  of  data  that  would  generally  be 
involved  in  such  an  approach  would  greatly  complicate  the 
second  part  of  the  problem.   A  more  efficient  approach  in 
terms  of  utilization  of  parameters  is  to  model  the  system 
and   use  the  model  parameters  as  a  description  of  its 
current  functioning. 

For  linear  systems,  ARMA  models  provide  a  very 
general  framework  with  a  number  of  possible  parameter  sets. 
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Three  candidates  are  polynomial  coefficients,  root  locations 
and  lattice  reflection  coefficients  with  the  latter  offering 
many  advantages .   In  addition  to  the  advantages  demonstrated 
by  the  experimental  results  of  Chapter  III,  reflection 
coefficients  provide  a  very  effective  and  methodical  way  to 
build  up  knowledge  of  a  system's  characteristics.   As  model 
order  is  increased  to  more  accurately  represent  a  system, 
reflection  coefficients  already  determined  don't  change 
making  them  ideal  candidates  for  use  in  a  dictionary  lookup 
scheme  (a  characteristic  not  shared  by  the  other  parame- 
terizations ) .   This  is  made  more  important  since  reduced 
order  models  could  be  adequate  to  detect  and  perhaps  diagnose 
some  faults,  especially  catastrophic  ones.   While  more 
parameters  are  required  when  reflection  coefficients  are 
used,  these  same  coefficients  also  provide  all  reduced  order 
models.   For  a  single  channel  ARMA  example,  8N  reflection 

coefficients  and  N  gains  provide  all  models  from  order  1  to 

2 
N  while  N  +2N  parameters  would  be  required  using  either 

polynomial  coefficients  or  roots  to  provide  the  same  infor- 
mation.  ( 6N  reflection  coefficients  are  needed  if  the  input 
is  white  noise  since  k,-=k9  =0.) 

A  similar  argument  could  be  made  for  the  use  of 
lattice  reflection  coefficients  with  the  nonlinear  ARMA 
model  for  fault  detection  and  diagnosis  of  nonlinear  systems. 
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B.   CONCLUSIONS  AND  OPEN  QUESTIONS 

The  purpose  of  this  research  was  to  extend  existing 
theories  and  methods  in  the  modeling  of  linear  and  nonlinear 
systems  to  broader,  more  general  types  of  models.   After  a 
discussion  of  available  results  in  AR  and  MA  modeling  of 
linear  systems  with  particular  emphasis  on  the  Levinson 
algorithm  and  lattice  filter  methods,  model  transition 
formulas  were  developed  to  relate  the  more  general  ARMA 
model  for  linear  systems  to  the  AR  and  MA  models.   It  was 
shown  that  with  suitable  assumptions,  the  ARMA  model  solu- 
tion could  also  be  obtained  recursively  in  order  using 
either  a  modified  Levinson  algorithm  or  lattice  filter 
methods.   These  results  were  developed  extensively  in  both 
theory  and  practice  for  single  channel  linear  ARMA  modeling 
with  experimental  verification  of  both  the  batch  processing 
and  adaptive  lattice  methods  presented.   Portions  of  these 
results  have  already  been  published.  [Refs.  65  and  66] 
The  theory  was  also  developed  to  generalize  these  results 
to  the  multichannel  ARMA  case. 

Based  on  the  simulation  results  it  was  concluded  that 
the  lattice  methods  offer  the  following  advantages  over  a 
conventional  brute  force  matrix  inversion  approach  to  ARMA 
modeling  using  windowed  correlation  estimates. 

1.   For  short  runs  of  data  the  batch  lattice  methods 
provide  much  more  accurate  results  than  the  brute 
force  method. 
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2.  The  batch  lattice  method  performs  much  better  than 
the  brute  force  method  when  the  system  is  over- 
modeled  , 

3.  The  MSE  as  a  function  of  model  order  is  well  behaved 
for  the  lattice  method. 

M- .   The  adaptive  lattice  method  has  difficulty  tracking 
zero  valued  overmodeled  parameters . 
The  cost  of  these  advantages  is  the  extra  computational 
burden  of  passing  the  data  through  the  lattice  filter  during 
the  modeling  process. 

In  the  discussion  of  nonlinear  system  models  the  Volterra 
model  was  considered  as  a  nonlinear  extension  of  MA  modeling 
and  it  was  shown  that  lattice  methods  could  be  used  to  obtain 
the  model  solution  if  the  problem  was  recast ,  using  the 
regular  form  of  the  Volterra  kernels.   Then  the  new  nonlinear 
ARMA  model  was  considered  and  it  was  shown  that  this  repre- 
sentation in  some  cases  solves  the  problem  of  requiring  a 
very  large  number  of  model  parameters  encountered  in  Volterra 
modeling.   Then  lattice  methods  were  developed  for  the  non- 
linear ARMA  problem  and  it  was  shown  that  the  linear  ARMA 
techniques  presented  earlier  are  a  speical  case  of  the  non- 
linear ARMA  methods.   For  both  types  of  nonlinear  models, 
the  recursive  in  order  nature  of  the  lattice  methods  was 
shown  to  allow  the  various  model  kernels  to  grow  in  one 
dimension  while  holding  their  boundaries  fixed  at  pre- 
specified  values  in  the  other  dimensions.   The  use  of  the 
model  nonlinear  ARMA  was  also  illustrated  with  two  examples 
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and  a  nonlinear  ARMA  model  was  proposed  for  the  tracking 
behavior  of  a  phase  locked  loop. 

Several  significant  questions  remain  for  the  continuation 
and  extension  of  this  work  and  are  listed  here, 

1.  Stability  of  the  linear  and  nonlinear  ARMA  models 
must  be  considered.   In  the  linear  problem,  stability 
is  dependent  on  the  roots  of  the  demoninator  poly- 
nomial of  the  synthesis  model.   The  methods 
developed  do  not  guarantee  stability  of  the  resulting 
model.   (This  was  not  found  to  be  a  problem  in 
practice,  however ,  unless  extremely  short  runs  of 
data  in  the  range  of  3  0  to  5  0  samples  were  used. 
Even  then,  model  instability  was  not  a  frequent 
occurrence.)   Stability  for  the  nonlinear  ARMA  model 
remains  to  be  clearly  defined. 

2.  Input  signal  requirements  in  the  nonlinear  ARMA 
modeling  process  need  to  be  investigated.   In  linear 
ARMA  modeling  the  power  spectrum  of  the  input  signal 
was  found  to  play  an  important  role.   No  requirements 
emerged  however,  on  the  probability  density  function 
(pdf)  of  the  input.   In  nonlinear  systems  where  the 
behavior  is  inherently  level  dependent ,  it  is  in- 
tuitively appealing  to  use  an  input  signal  with  a 
flat  power  spectrum  across  the  frequency  range  of 
interest  and  whose  amplitude  is  uniformly  distributed 
over  the  range  of  interest. 
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3 .  The  inability  of  the  adaptive  lattice  method  to 
track  zero  valued  overmodeled  parameters  is  an 
interesting  problem  warranting  further  consideration. 
If  some  means  of  detecting  the  degeneration  of  the 
cost  surface  towards  an  infinite  trough  can  be  found, 
the  problem  could  be  remedied  by  simply  reseting  the 
appropriate  parameters  to  zero. 

4.  Experimental  experience  needs  to  be  gained  with  the 
nonlinear  ARMA  model  itself  and  the  lattice  methods 
developed  for  it. 

5.  The  characteritics  of  the  lattice  solution  methods 
need  to  be  further  quantified  to  gain  a  comprehensive 
understanding  of  how  and  why  it  performs  as  it  does. 
Also,  further  comparisons  should  be  made  between 

the  lattice  methods  and  conventional  methods.   Some 
comparisons  were  made  here  for  batch  processing 
methods  but  only  a  rectangular  window  was  used  on 
the  data.   Comparisons  should  be  made  using  other 
types  of  window  functions  in  the  brute  force  method 
and  the  adaptive  lattice  method  should  also  be  com- 
pared with  a  conventional  LMS  adaptive  algorithm 
applied  to  the  equation  error  model  in  which  all  of 
the  a(i)  and  b(i)  coefficients  are  adapted  simul- 
taneously , 

6.  In  the  adaptive  lattice  method,  scaling  of  the  lattice 
input  signals  needs  to  be  investigated.   It  was 
noted  that  the  ratio  of  largest  to  smallest  eigenvalue 
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of  the  input  autocorrelation  matrix  was  related  to 
the  speed  of  convergence  of  the  adaptive  algorithm. 
For  the  first  lattice  stage  this  matrix  is 

y(k)  u(k) 


P      =  £  { 


y2(k) 


uCk)  y(k)   u  (k) 


} 


If  the  system  has  a  high  gain  such  that  the  mean 
square  value  of  the  output  y(k)  is  very  much  greater 
than  that  of  the  input  u(k) ,  convergence  will  be 
slow.   This  could  be  remedied  by  implementing  an 
adaptive  scaling  scheme  at  the  lattice  input  (perhaps 
similar  to  the  first  order  low  pass  filter  estimates 
used  for  adaptive  step  size  normalization) . 
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APPENDIX  A 
Alternate  Multichannel  Model  Forms 

Multichannel  generalization  of  the  MA  and  AR  models  were 
discussed  in  Chapter  II  along  with  their  solution  via  the 
Levinson  algorithm.   Here  the  multichannel  models  and  the 
Levinson  algorithm  are  developed  in  an  alternate  form  more 
compatible  with  other  linear  and  nonlinear  modeling  problems 
to  be  explored  later. 

Consider  first  an  N-th  order,  Q  -channel ,  AR  model  where 
the  current  value  of  the  signal  vector  x(k) 


T 
x(k)  '  =  [x  (k)  ...  x   (k)]  (A. la) 

1         ^0 


T 
X(z)  =  [X, (z)  ...  Xn  (y)]  (A. lb) 


1      •"     % 


is  to  be  predicted  from  a  weighted  combination  of  N  past 
values  of  each  of  the  component  signals.   For  each  signal 
this  estimate  can  be  written  as 

Qo       N 

,(k)  =  2-r  2-f 


x.kk)    =   -^  L-a      d..(n)  x.(k-n)  (A. 2a) 

3       ftl   n=l    ^      x 


or 


r\ 


•S'^: 


X.(k)  =  /j  d.  .(z)  X.(z)  (A2.b) 

3       f=t  '  13      l 


where         N 

.  d. 
•  3 


l.  .(z)  =  y.  d. 

n=l 


d.  .(z)  =  /'      d.  .(n)  z"n  (A. 2c) 
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Define  an  N-vector  for  each  of  the  Q   channels  to  contain 
their  required  time  histories  as 


x.(k)  =  Cx.(k-l)  . . ,  x.(k-N)] 
-11  1 


(A. 3a) 


for  1  <   i  <■   Q   and  embed  all  of  these  vectors  into  a  single 
NQn~data  vector  given  by 


X(k)  =  Cx^Ck)' 


x„  (k)1] 


=Q 


o 


(A. 3b) 


Define  a  NQ   x  Qn  matris  of  weights  as 


D 


in 


V 


v 


\*o 


(A. 4a) 


where  the  N-vector s  d..  are  given  by 

— lj      °      J 


d..  =  [d..(l)  ...  d.jCN)] 


(A. 4b) 


and  contain  the  coefficients  of  the  polynomials  d..(z). 
These  polynomials  can  be  combined  in  a  matrix  polynomial 
defined  by 


D<z)  = 


dlx(z) 


v(z) 


dVz) 


dQ  Q  (Z) 


(A. 4c) 
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With  these  definitions,  the  N-th  order  prediction  of  x(k) 
and  its  associated  prediction  error  vector  becomes 


x(k)T  =    X(k)T  D 


(A. 5a) 


rn  rn        /\  rn 

e(k)1  =  x(k)   -  x(k) 


(A. 5b) 


or  in  the  transform  domain 


E(z)T  =  X(z)T  [I  -  D(z)] 


(A. 5c) 


Comparison  of  equation  (A.  5c)  and  (2.4-|4c)  show  that  this 
matrix  polynomial  differs  from  the  more  generally  used  form 
of  (2.4-4-c)  by  a  transposition.   The  coefficient  matrix  JD 
can  be  found  by  minimizing  the  trace  of  the  prediction 
error  covariance  matrix 


P  =  s{e(k)  e(k)  } 


(A. 6) 


leading  to  a  system  of  linear  equations  given  by 


s{X(k)    X(k)T}    D   =    s(X(k)    x(k)T} 


or 


R  R 

xlxl  X1XQ, 


R 

"xQ0xi 


R 


D 


r 

-x1x1 


JV1 


(A. 7a) 


■xixQ, 


r 


(A. 7b) 
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Adopting  a  shorthand  notation,  this  becomes 


R  D 


(A. 7c) 


As  in  the  case  of  a  single  channel  autoregression,  the 
multichannel  generalization  of  the  Levinson  algorithm  also 
requires  that  the  backward  prediction  problem  of  estimating 
x(k-N)  backward  in  time  from  the  values  of  x(k-N+l)  through 
x(k)  ,  be  solved  simultaneously.   Using  an  overscore  to 
indicate  quantities  associated  with  the  backward  problem  it 
follows  that 


—        T   —  T 

X(k)       D      =      x(k) 


(A. 8a) 


T  T         *        T 

e(k)      =    x(k-N)      -   x(k) 


(A. 8b) 


or 


T  T  — 

E(z)      =    X(z)      z"N[I    -    D(z)] 


(A. 8c) 


_  T  T    T 

where        _X(k)    =    Cx,  (k)  ...         x_    (k)    ] 

1  ^0 


(A.8d) 


X. (k)    =    Cx.(k-N+1)     ...    x. (k)] 


(A,8e) 


and 


D 


in 


V 


-iQ, 


^nQ 


0^0 


(A.8f) 
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d.  .    =    [d. .(1) 


d. .(N)] 


(A.8g) 


with  D(z)    = 


dxl(z) 


Vz) 


N 


d         (z) 

1Q0 


QnQ 


OyO 


and 


i] 


■  E 


d.  .  (z)    =     Z-/     d.  .  (n)    z 


n 


n=l 


i] 


(A,8h) 


(A.8i) 


Setting  the  coefficients  of  this  backward  predictor  to  mini- 
mize the  trace  of  the  backward  prediction  error  covariance 
matrix 


P   =   e{e(k)  e(k)T} 


(A. 9) 


leads  to  a  MMSE  solution  given  by 


e{X(k)  X(k)  }  D   =   s{X(k)  x(k-N)T} 


(A. 10a) 


and  since   e{x.(k)  x.  ,,   x  }  =  R 

-l     -i ( k )      -x . x . 

i  3 


and        s{x. (k)  x. (k-N)}  =  r 

—l     —  j  — x  .  x  . 


this  can  be  written  as 
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R 


-x1x1 


V1 


T 
R 

X1XQ 


0 


s T 

xQ0XQo 


D 


r 
-x1x1 


r 

-xixQ, 


Adopting  a  shorthand  notation,  this  becomes 


V1 


'*%*% 


(A.lOb) 


R    D 


(A. 11) 


At  this  point  it  is  important  to  take  note  of  a  subtle 
difference  between  the  single  and  multichannel  problems  that 
has  arisen.   While  the  single  channel  equivalent  of  equation 
(A. 10b)  for  the  backward  predictor  was  not  written  previously 
it  is  obviously 


T  — 
R  l  d 
-x1x]_   - 


r 
~xlxl 


(A. 12a) 


and  since 


R 


x1x1 


^x  x 

xlxl 


(A. 12b) 


it  follows  that  the  single  channel  forward  and  backward 
prediction  problems  have  exactly  the  same  solution  (as  was 
found  to  be  the  case  in  Section  II. D.  when  the  AR  prediction 
error  lattice  was  developed).   This  fact  is  responsible  for 
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a  number  of  simplifications  in  the  single  channel  case; 
a.)   Only  a  single  reflection  coefficient  k      was 
needed  to  link  the  n-th  and  (n+l)-st  order  solu- 
tions for  both  the  backward  and  forward  problems 
(see  equations  (2.25b)  and  (2.25c)). 
b.)   |B(z)|   =   |B(z)| 
c.)   e{e(k)2}  =  e{e"(k)2} 

d.)   The  development  of  the  Burg  method  and  the  Itakaura- 
Saito  method  for  calculating  the  reflection 
coefficients  are  a  direct  consequence   of  c  above. 
Unfortunately  however,  none  of  these  simplifications  carry 
over  into  the  more  general  multichannel  AR  problem  because 
by  comparing  equations  (A. 7b)  and  (A. 10b)  it  is  evident 
that  in  general  D  i      L)  . 

Proceeding  as  in  the  single  channel  case,  it  is  shown 
in  Appendix  B  that  because  of  the  structure  in  the  correlation 
matricies  JR  and  J\  ,  equations  (A. 7c)  and  (A. 11)  can  be 
re-written  as 

R   F   =  P  (A. 13a) 


and 


R   F   =  P  (A. 13b) 


where  F3  _F_,  _P_   and  P   are  obtained  from  E),  D,  T_   and 

f  respectively  by  taking  their  component  vectors  and  turning 
them  upside  down  in  place;  i.e. 
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£n 


V 


*1Q, 


h 


oxo 


(A. 14a) 


where 


and 


with 


f . .    =    [d. .(N)    ...    d. .(1)] 

-i]  13  l] 


P 


-x1x1 


X1XQ 


0 


PxQ0xi     ■■"        PxQ0xQ0 


(A. 14b) 


(A. 14c) 


£x.x.  =  [R     (M)   ...   R     (1)] 
11       X  •  X  .  X  .  X  . 

J     i  :        i  ] 


(A.14d) 


As  will  soon  be  evident,  the  relationships  of  equations  (A. 13) 
form  the  cornerstone  for  the  Levinson  algorithm  and  are  made 

possible  because  the  blocks  comprising  the  R_  and  _R  matricies 

T 
are  themselves  Toeplitz  and  because  R     =  R 

r  — X  •  X  .     — X  .  X  . 

Assume  that  the  (n+l)-st  order  solutions  can  be  related 


to  the  n-th  order  solutions  by 

u(n)~ 
(n+1) 


d 
-1J 


d 

-13 


0 


-13 


k 


(n+1) 

'ij 


(A. 15a) 
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and 


-r(n+l) 
d  .  • 


-13 


-Cn) 

-13 


Lr^3   J 


(A. 15b) 


Embedding  the  vectors  z . .   and  e . .   and  the  coefficients 

-13      -i] 

.  (n+1)    ,  r(n+l)  •  .     .  .      .       .  ,  ,-(n)    .  p-(n) 
k.  .     and  k.  .     into  matrices   designated  £    and  £•  ■> 
lj         lj  &      —        —    ' 

K      and  K       and  solving  for  them  in  the  (n_l)-st  order 
problem  it  is  shown  in  Appendix  C  that 


.(n)    _  p"(n)  K(n+1) 


(A. 16a) 


^(n) 


c^n;  =  _  p^n;   K 


(n)  ^(n+1) 


(A. 16b) 


K(n+1)  =  [r  (o)-  r(n>TD(n)]  [r  (n+i)-/^(n)TD(n)] 


—xx 


-XX 


(A. 16c) 

K(n+1)  =  [r  co)-r(n)TD(n)]"1CR  (n+i)T-/^(n)TD(n)] 

—  —xx     —     XX        —     — 

(A.16d) 

Also  the  inverted  terms  on  the  right  hand  side  of  equations 
(A. 16c)  and  (A.16d)  are  just  the  backward  and  forward  pre- 
diction error  covariance  matrices   for  the  optimum  n-th 
order  models  so  that  (A. 16c)  and  (A.16d)  become 
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K(n+1)  .  p(n)^[R   (n+1)./?^)TD(n)]  (A. 17a) 

K(n+1)  =  P(nrl[R   (n+l)T-^(n)TD(n)]  (A. 17b) 


As  in  the  case  of  the  single  channel  problem,  the  prediction 
error  covariance  matrices   obey  the  recursions 


(n+l)  .  p(n)[I_K(n+l)K(n+l)]  (A_18a) 


pCn+1)  =  F(n)[I.KCn+l)jf(n+l)]  (A. 18b) 


completing  the  multichannel  generalization  of  the  Levinson 
algorithm  for  AR  models . 

Comparing  equations  (A. 16),  (A. 17)  and  (A. 18)  with  their 
single  channel  counterparts  (2.18)  and  (2.19)  it  is  clear 
that  the  multichannel  Levinson  algorithm  simply  represents 
a  matrix  algebra  generalization  of  the  single  channel  al- 
gorithm.  Once  again,  predictors  of  all  orders  0  _<  n  _<  N  are 
obtained  in  the  process  of  finding  the  N-th  order  predictor 
along  with  all  their  prediction  error  covariance  matrices 
and  the  overall  minimization  requiring   the  inversion  of 
a  QnN  x  QnN  matrix  is  replaced  by  a  sequence  of  N  minimiza- 
tions, each  requireing  the  inversion  of  two  Qn  x  Q   matrices 

/■□(n)      k^)  \ 

(P     and  P    ) . 
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Using  the  relationships  given  in  equations  (A. 15)  and 
(A. 16),  successive  orders  of  matrix  polynomials  can  also 
be  related  to  one  another  by 

[I-D(n+1)(z)]  =  [I-D(n\z)]-z-Vn[I-D(n\z)]K(n+1) 

(A. 19a) 

z-(n+1)[I-D(n+1)(z)]  =  z-1z-n[I-D(n)(z)]-[I-D(n)(z)]K(n+1) 

(A. 19b) 

T 
Premultiplymg  both  sides  of  these  equations  by  X(z)  ,  trans- 
posing, and  transforming  into  the  time  domain  provides  rela- 
tionships between  the  prediction  error  signals  at  each  stage. 


e(n+1)(k)  =  e(n)(k)  -  K(n+1)T  e(n)(k-l)  (A. 20a) 


e"(n+1)(k)  =  e(n)(k-l)  -  K(n+1)T  e(n)(k)  (A. 20b) 


Recognizing  that  for  a  zeroth  order  prediction,  the  forward 
and  backward  prediction  error  vectors  are  just  the  input 
vector  itself,  it  follows  that 


e(0)(k)  =  e(0)(k)  =  x(k)  (A. 20c) 


and  the  multichannel  equivalents  of  equations  (2.28)  have 
been  obtained. 
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Next  consider  the  N-th  order  multichannel  MA  model  in 

which  the  current  value  of  a  Q  -vector  of  output  signals 

y_(k)  is  to  be  predicted  from  the  present  value  and  N  past 

values  of  a  Q. -vector  of  input  signals  x(k)  . 

Qi    N 

y.(k)  =  Zj   2-j    d..(n)  x.(k-n)  (A. 21a) 

i=l  n=0 
Qi 


or 


Y.(z)  =  /j    dt.(z)  X.(z)  (A. 21b) 


i  =  l 


where         N 


■  E 


d. .(z)  =  /-J    d. .(n)z  n  (A. 21c) 

n=0 

Using  a  superscript  "+"  to  indicate  the  fact  that  the  vectors 
are  indexed  from  0  to  N  rather  than  from  1  to  N,  define 
(N+l)-vectors  for  each  of  the  input  channels  to  contain  their 
required  time  histories  as 


T 
x.(k)  =  Cx.(k)   ...   x.(k-n)]  (A. 22a) 

-1  1  ] 


and  embed  these  vectors  into  a  single  Q . (N+l) -data  vector 


given  by 


T 
X  +  (k)    =  [x*(k)T    ...    xt  )k)T]  (A. 22b) 

^i 
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Define  a  Q.(N+1)  x  Qn  matrix  of  weights  as 


D+  = 


iii 


4* 


0 


"Q-Qn 


(A. 23a) 


where  the  (N+l)-vectors  d..  are  given  by 

—13      °       J 


d. .  =  [d.  .(0) 
-13      1] 


,  .  d. .(N)] 


(A. 23b) 


and  contain  the  coefficients  of  the  polynomial  d..(z).   These 
polynomials  can  be  combined  into  a  single  matrix  polynomial 
given  by 


D+(z)  = 


d^x(z) 


dQ.l(z) 


d1Q  <») 


VoU) 


(A. 23c) 


With  these  definitions,  the  N-th  order  MA  prediction  of  y(k) 
is  given  by 


+  /,  nT    rw+ 


Z(k)x  =  X/oo1  D 


(A. 24a) 
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or  in  the  transform  domain, 


T 


T  TA  + 


Y(z)1  =  X(z)1  D  (z) 


(A. 24b) 


Defining  a  prediction  error  vector  as 


eQ(k)       -    y(k)1  -  y(k)1 


(A. 25a) 


and  setting  the  coefficients  in  Jj     to  minimize  the  trace 
of  the  prediction  error  covariance  matrix 


PQ  =  e{e0(k)e0(kr} 


(A. 25b) 


results  in  a  solution  given  by 


e(X+(k)  X  +  (k)T}  D+  =  e(X+(k)  y(k)T}        (A. 26a) 


or 


*  +  + 
xixi 


XQ.X1 


X1XQ. 


s  +  + 

x^  x, 


'Qi  Qi 


D+ 


"xQiyi 


•  E  + 


xiyQ, 


r 


<iy% 


(A. 26b) 
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Adopting  a  shorthand  notation  this  becomes 


R+  D+  =   r+ 


(A. 26c) 


Assume  a  relationship  between  the  components  of  the 
(n+l)-st  and  n-th  order  solutions  given  by 


,+(n+l)  _ 

d .  . 


"d!?n)" 

+ 

0 

_       __ 

*L3 


(n+1) 


(n+1) 


'i] 


(A. 27) 


Embedding  the  vectors  v..  and  the  coefficients  g.. 

-13  13 

•  *.     4.  •  ■    ,4      4.  a    /tCn+1)    ,  ~(n+l)     ,    ,  . 
into  matrices   designated  (/      and  G      ,  and  solving 

for  them  in  the  (n+l)-st  order  MA  problem  it  is  shown  in 

Appendix  D  that 


j*(n+l)  =  _  p(n+l)  G(n+1) 


(A. 28a) 


G(n+1)  a  pCn-l)"1^   (n+l)-^(n+1)TD+(n)] 

xy 


(A. 28b) 


—(n+1)  — (n+1)     _(n+l) 
where  _T_      fl  amd  P      are  matrices   that  emerge 

from  the  backward  prediction  problem  in  a  multichannel  (n+l)-st 

order  autoregression  on  the  input  signal  vector  x(k) .   Again, 

it  is  clear  that  the  multichannel  MA  solutions  given  by 

equations  (A. 28)  are  a  matrix  algebra  generalization  of 

equations  (2.23)  for  the  single  channel  MA  model. 
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To  relate  successive  order  MA  matrix  polynomials  to  one 
another,  equations  (A, 27)  and  (A. 28a)  can  be  used  to  write 


D+(z)(n+1)  =  D+(z)(n)  +  z-n[I-D(n)(z)]G(n+1) 


(A. 29) 


where  the  second  term  on  the  right  hand  side  that  premul- 
tiplies  G  is  the  backward  prediction  error  matrix  polynomial 

from  the  autoregression  on  the  input  signal  x(k) .   Pre- 

T 
multiplying  by  X  (z),  transposing,  and  transforming  into 

the  time  domain  results  in  the  multichannel  equivalent  of 

equation    (2.37) 


"(n+1),,.         A(n),,.        n(n+l)T-(n+l)M  ,  ,&    0rO 

y  (k)    =    y         (k)    +    G  e  (k)  (A. 30) 


and  completes  the  derivation  of  the  recursive  in  order 
solutions  for  the  multichannel  MA  model. 
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APPENDIX  B 
A  Key  Relationship  For  The  Multichannel  Levinson  Algorithm 

Equation  (A. 13)  is  a  key  relationship  in  the  development 
of  the  Levinson  algorithm  responsible  for  much  of  the 
algorithm's  simplicity.   Without  this  relationship,  more 
than  just  the  K  and  K  matrices  would  be  needed  to  obtain 
the  new  model  from  the  previous  model. 

In  equation  (A. 7),  consider  the  multiplication  of  the 

i-th  block  row  of  _R  by  the  j-th  block  column  of  JD  to 

form  r 

— x-x. 

i    ] 


R  d, .+...+  R  d_    .    =    r 

— x.x,— lj  — x.x„    — Qn2         — x-x. 

l    1      J  l   QQ    X0J  l    j 


(B.l) 


In  particular,  consider  a  general  term  on  the  left  hand 
side  of  equation  (B.l)  in  detail. 


R   „  (0) 
x  .x 

l  m 


R     (1-N) 
x.x 

i  m 


R     CN-1)      R     (0) 
x.x  x-x 

i  m  l  m 


d  .(1) 

m: 


d  .(N) 
m] 


Rx.x.(1) 
i  ] 


R     (N) 
*iXj 


(B.2) 


Define  upside  down  versions  of  the  d..  and  r     vectors 
v  —in      —x.x. 

J       i  ] 


as 


f .  .  = 
-13 


d.  .(N) 

^3 


d..(l) 

13 


(B.3a) 
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and 


-x .  x . 

i  1 


R     (N) 
x  .x . 

i  3 


R     (1) 
x  .x . 

i  1 


(B,3b) 


Using  these  permuted  vectors  in  equation  (B.l)  in  place  of 
the  d  and  r  vectors,  the  relationship  is  still  satisfied  if 
the  R  matrices   are  permuted  as  well.   In  particular,  from 
(B.2)  it  is  evident  that 


R   v  (0) 
x  .x 

l  m 


R     (1-N) 
x  .x 

l  m 


R   v  (N-l) 
x  .x 

i  m 


.  .   R   „  (0) 
x  .x 

i  m 


d.  .(N) 


d. .(1) 


R     (N) 
x  .x . 

i  1 


R  v  CD 
X  .x. 

1  ] 


or  equivalently 


(B.4a) 


T  T 

R      f ,.+...+  R   „    fn  .  =  pv  v 
-xixi   -1]  -x.x^  -Q0:    i^x. 


(B.Lfb) 


Embedding  the  f . .  and  p     vectors  into  matrices   designated 
6      -1]      -x.x.  to 

F  and  _^  respectively  and  using  the  definition  of  _K  it 
follows  that 


R  F  =  /* 


(B.5) 


Defining  F.  .  and  p      as  upside  down  versions  of  their 
°  —11      — x.x.      r 

corresponding  vectors  m   JLJ  and  _T^  in  equation  (A. 11)  and 
embedding  them  in  matrices   designated  r  and  /^s  it  also 
follows  from  a  similar  development  that 

R   F_   =   ^  (B.6) 

Equations  (B.5)  and  (B.6)  are  essential  to  the  develop- 
ment of  the  Levinson  algorithm  and  are  made  possible  by  the 
Toeplits  structure  of  the  block  components  of  the  R  and 
R  matrices. 
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APPENDIX  C 

The  Multichannel  Levinson  Algorithm  Solution 

For  AR  Modeling 


In  the  (n+l)-st  order  versions  of  equations  (A. 7)  and 
(A. 10),  the  component  matrices   that  make  up  R  and  JR  can 
be  written  as  follows 


R 


(n+1) 

:x.x . 

i  3 


R  (n) 

i  3 


R 


x.x.(-N) 

i  3 


R 


x.x. (-1) 
1-2 


Rx.x.(N) ' ' -Rx.x-(1) |  R   v  ,n. 
l  ]         l  :    J   xiXj(0) 


R 


(n) 

x.x. 

i  ] 


p  (n) 

-x.x. 

3  i 


-  (n) 

Px.x. 
i  3 


R 


x.x. (0) 

i  3 


(C.I) 


and 


_(n+l) 

*-x  x 
xixj 


(n+1) 

*x.x. 

i  3 


T 
R(n)T 

—x.x. 

i  3 


(n)1 

.  x  . 

i  3 


-x 


(n) 
.x . 

i  3 


-x.x 


R 


x  .  x  .  (  0  ) 

l  1 


(C.2) 
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Additionally 


rCn+l) 

-x.x. 


,(n) 
-x.x . 

i_3 


vx.x.  (n+1) 

i  3 


(C.3) 


With  these  matrices   and  vectors  written  in  this  partitioned 
form,  and  with  the  relationships  assumed  in  equations  (A. 15) 
between  the  n-th  and  (n+l)-st  order  solutions,  the  (n+l)-st 
order  modeling  equations  become: 

Forward  Model: 

R(n)  jj(n)  +  R(n)  c  (n)  +  ^(n)K(n+l)  =  r  (n)    (c>4a) 


/?(n)  D(n)  +  /?(n)  c(n) 


(n+1) 
—xx    — 


(C.t+b) 


Backward  Model: 


T  — 


Jj(n)  £)(n)  +  JJ(n)   £  (n)  +  ^(n)  ^(n+1)  _  r  (n) 


(C.5a) 
^(n)T  g(n)  +  ^(n)T  £*(n)  +  ^(Q)^(n+l)  =  R^Cn+l)7 


(C.5b) 


223 


Equations  (C.M-a)  and  (C,5a)  contain  the  n-th  order  modeling 
equations  within  them  however,  and  therefore  can  be  written 
as 

R(n)  ^(n)  s  m   ^(n)  ^(n+1)  (c>5a) 

R(n)  £-(n)  __      _  ^(n)  ^Cn+1)  (c>6b) 

and  applying  the  relationship  developed  in  Appendix  B 
(equations  (B.5)  and  (B.6))  yields 

c(n)   a   _  fCn)  K(n+1)  (c>7a) 


c(n)    ■  =    -  F(n)  K(n+1) 


(C.7b) 


Thus  the  e_   and  £  vectors  have  been  found  in  terms  of  the 
known  quanties  f  and  f  and  the  unknown  K  and  K  matrices. 
Substituting  equations  (C.7)  into  (C.<4b)  and  (C.5b)  completes 
the  solution  resulting  in  expressions  for  the  K  and  K 
matrices   given  by 


K(n+1)  =  [R   (0)  -  r(n)TD(n)]-1[R   (r+1)  _^(n)T  jj(n)  3 

__  XX  '         ■■  —XX  "  ■ 


(C.8a) 
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K(n+1)  =  [r  (o)-r(n)TD(n)]"1CR  (n+i)T-^(n)TD(n)] 

—         —xx     —    >=■       — xx       —    — 

(C.8b) 


The  terms  on  the  right  hand  side  of  these  equations  that 
are  inverted  are  just  the  backward  and  forward  prediction 
error  covariance  matrices,  for  the  optimum  n-th  order  models 
given  by 


P(n)  =  R   (0)  -  r^)TD(n)  (C.9a) 

—      —xx      —    — 


p(n)  =  rxx(o)  -  _r_(n)  D(n)  (c.9b) 


Using  equations  (C.3),  (A. 15)  and  (C.7),  the  forward  pre- 
diction error  covariance  matrix  for  the  optimum  (n+l)-st 
order  model  can  therefore  be  written  as 


p(n+l)  _  R   (0)_rCn)  jj(n)+r(n)  p (n)K(n+l) 


-  R   (n+l)TK(n+1)  (C.lOa) 

—xx      — 


p(n)  +  [  r  (n)Tp(n)^R   Cn+1)T]K<n+l)        (c.lOb) 
—     —    —    —xx       — 


p(n)[I_p(n)-1[^(n+1)T_^(n)To(n)]^(n+l)] 


(C.lOc) 
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p(n+l)=  p(n)[I_^(n+l)K(n+l)]  (C.lOd) 


and  following  a  similar  development  for  the  backward  pre- 
diction error  covariance  matrix  results  in 


pCn+1)  =  ¥(n)LJ   _  K(n+l)^(n+l)]  (c>11) 
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APPENDIX  D 

The  Multichannel  Levinson  Algorithm  Solution 
For  MA  Modeling 


First  note  that  because  of  the  definitions  of  the  x.(k) 

—l 

and  x. Ck)  vectors  (indexed  from  0  to  n  and  1  to  n  respec- 
tively) in  the  n-th  order  models,  the  matrices   R      in 

x.x. 
the  n-th  order  MA  problem  could  also  be  written        in 

terms  of  x.(k)  and  x.(k)  as  (assuming  stationarity ) 
— i        —  j  &  J 


pCn) 

±  +   + 
x.x . 

i  3 


R(n+1) 

^x.x. 

i  3 


(D.la) 


so  that 


R+ 


(n) 


R 


(n+l) 


(D.lb) 


(n+l) 


(n+l) 


The  components  of  _R        and  _T_'       in  the  (n+l)-st 
order  MA  modeling  problem  can  be  written  in  partitioned  form 
as 


R(n+1) 

£  +  + 
x.x. 

i  3 


*  +   + 
x.x. 

i    3 

-(n+l) 

P-x.x. 

3    i 

-(n+l) 

-x.x. 
i  3 


•x.x.(O) 

i  3 


(D.2a) 
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and 


(n+1) 
—  + 
x.y. 


(n) 

£  + 
x.y. 


R     (n+1) 
x.y. 


(D.2b) 


Using  these  partitioned  forms,  and  the  relationship  in 
equation  (A. 27)  between  the  n-th  and  (n+l)-st  order  solutions, 
the  (n+l)=st  order  modeling  equations  become 


(n)    ,(n) 


(n) 


R+     D+     +  R+     ^(n+1)  +  ^(n+l)G(n+l)=r  + 


(n) 


(D.3a) 


^(n+l)Tp+(n)  +  ^(n+l)T  £<n+l)  +  ^(Q)G_(n+l) 


=  R   (n+1) 
-xy 


CD. 3b) 


Equation  (D.3a)  contains  within  it,  the  modeling  equation  for 
the  n-th  order  MA  model  and  using  equation  (D.lb)  can  be 
rewritten  as 


gCn+l)  J  (n+1)    _  /9(n+l)G(n+l) 


(D.4) 


A  comparison  of  this  result  with  equation  (C.6a)  shows  that 


1}   (n+1)    __    Y  Cn+l)G(n+l) 


(D.5) 
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where   F_       is  the  permuted  version  of  the  backward 
solution  in  an  (n+l)-st  order  AR  model  of  the  input  signal 
vector  x(k)  .   Furthermore,  substituting  equation  (D.5)  into 
(D.3b)  results  in  a  solution  for  G  n    given  by 


G(n+1)  ;  CExx(0)  -  ^(n+l^jpCn+l)^1 


[R   (n+l)-/<5(n+1)TD+(n)]  (D.6) 

— xy      —      — 


Since 


/<^(n+l)Tp(n+l)  _  r  (n+l)T£)(n+l) 


it  follows  that  the  inverted  term  on  the  right  hand  side 
of  equation  (D.6)  is  just  the  backward  prediction  error 
covariance  matrix  for  the  optimum  (n+l)-st  order  AR  model 
on  the  input  signal  vector  x(k). 
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APPENDIX  E 
Prony ? s  Method  For  ARMA  Modeling 


Prony ' s  method  [Refs.  8,  5  2  and  56]  obtains  a  zero  pole 
model  for  a  system  by  matching  the  impulse  responses  of  the 
system  and  model  over  the  first  M+N+l  sample  intervals  where 
M  and  N  are  the  orders  of  the  model  transfer  function  num- 
erator and  denominator  polynomials.   Assume  that  a  signal 
y(k)  is  available  that  represents  the  impulse  response  of 
a  causal  system  and  that  a  rational  transfer  function  model 
for  this  systems  is  desired.   Using  a  """  to  denote  the 
model  output  and  u(k)  to  denote  the  input  to  both  the  system 
and  model,  the  model  transfer  function  is  given  by 


-n 


H(z)  = 


M 

S   a(n) z 

Y(z)    n=0 

uTzT  '     N 

1+  l    b(n)z 
n=l 


(E.l) 


-n 


For  a  unit  sample  input  it  follows  that 


B(z)  Y(z)  =  A(z) 


(E.2) 


Equating  like  powers  of  z  in  this  relationship  results  in 


N 

iLi   b(i)  y(n-i)  = 
i=0 


where  b(0)=l. 


a(n)  ;    0  <  n  <  M 


else 


(E.3) 
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Equating  y(k)  and  y(k)  over  the  interval  0  <  k  _<  M+N  produces 
a  set  of  M+N+l  equations  which  can  be  expressed  in  matrix 
form  as 


y(0) 
y(l) 


y(M) 


H 


y(M+l) 


y(M+N) 


0       0    .  .  .   0 
y(0)    0    .  .  .   0 

y(M-l)  y(M-2)  ... 


y(M)    y(M-l) 


y(M) 


b(l) 


b(N) 


a(0) 


a(M) 
0 


(E.4a) 


or  adopting  a  shorthand  notation 


*1 


Si 


y-i        Y-2 


(E.4b) 


Assuming  that  the  NxN  matrix  Y~  is  not  singular  if  follows  that 


b  =  -Y-1  y2 


(E.5a) 


+     +       -1 

a   s  Zi  -  Ii  l2  z-2 


(E.5b) 
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The  original  Prony  method  goes  on  to  form  a  partial 
fraction  expansion  and  inverse  transformation  on  the  model 
transfer  function  H(z)  resulting  in  a  model  for  the  impulse 
response  of  the  system  given  as  a  sum  of  complex  exponentials. 
This  in  unnecessary  here  however,  since  a  rational  transfer 
function  model  is  the  form  sought. 

Prony ' s  method  inherently  assumes  that  matching  a 
sufficient  portion  of  the  impulse  response  of  the  system 
results  in  an  accurate  model  but  this  is  not  necessarily 
the  case  unless  the  impulse  response  damps  out  quickly  or 
unless  the  system  can  be  represented   exactly  by  a  low  order 
model.   Otherwise  a  very  high  order  model  may  be  required 
to  obtain  sufficient  accuracy.   Other  difficulties  asso- 
ciated with  this  technique  are: 

1.)   The  system  impulse  response  must  be  available; 

2.)   There  are  no  built  in  mechanisms  to  test  for  or 
ensure  stability  of  the  model ; 

3.)   There  is  no  averaging  of  noise  in  the  data; 

4.)   Only  a  small  portion  of  the  available  data  points 
(M+N+l)  are  actually  used. 

These  difficulties  can  be  partially  overcome  by  modifying 
the  procedure  to  obtain  an  approximate  match  of  the  system 
and  model  responses  over  their  entire  duration  rather  than 
an  exact  match  over  the  first  iM+N+1  points.   Consider 
equations  (E.4)  modified  to  include  the  entire  signal  y(k) 
for  0  <  k  <  °°. 
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y(0) 
y(l) 


0 
y(0) 


y(M) 


y(M-l) 


(E.6a) 


y(M+l) 


y(M) 


Adopting  a  shorthand  notation  this  becomes 


*1 


Y-l 


^3 


Y-3 


1 

b 


(E.6b) 


This  yields  two  equations 


+  + 

+  Y,  b  =  a 


*1  +  il  S 


(E.7a) 


y3  +  Y3  *  =  £ 


(E,7b) 


but  with^y - ,  Y~  and  0_  having  an  infinite  number  of  rows, 
equation  (E.7b)  will  in  general  have  no  solution.   In  practice 
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only  a  finite  portion  of  the  system  impulse  response  y(k) 
can  be  considered  but  equation  (E.7b)  will  still  be  incon- 
sistent in  general.   A  least  squares  estimate  of  b  can  how- 

.  .  .  .     T 
ever,  be  obtained  by  minimizing  e  e  where 


e  =  Y3  b  -  (-y3)  (E.8a) 


resulting  in 


b  =  CYgYg)    Y3  y_3  (E.8b) 


which  in  turn  can  be  used  in  equation  (E.7a)  to  find  a  . 
This  approximate  version  of  Prony's  method  is  the  one  most 
commonly  applied. 
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APPENDIX  F 
Parabolic  Surfaces  In  n-Dimensions 

Multi-dimensional  parabolic  surfaces  are  described  by 
an  equation  of  the  form 


y=xAx-2xTb+c  (F.l) 


where : 

y  is  the  independent  variable; 

x  is  a  vector  of  dependent  variables; 

A  is  a  symmetric  constant  matrix; 

b  is  a  constant  vector; 

c  is  a  constant. 
(x,  b  and  c  can  also  be  considered  as  matricies  with  the 
trace  of  the  right  hand  side  set  equal  to  the  independent 
variable  but  the  problem  remains  essentially  unchanged.) 
Completing  the  square  for  nonsingular  A  this  becomes 


y  =  (x  -  A"1b)T  A(x  -  A"xb)  +  c  -  bTA_1b  (F.2) 


so  that  for  positive  definite  A  it  is  clear  that  the  minimum 

value  of  y  is  obtained  for  x  =  A"  b  and  this  minimum  is 

T  -1 
c  -  b  A  b.   It  is  also  clear  that  nonzero  values  of  b  and  c 

simply  raise  and  lower  the  surface  and  move  the  minimum 

point  away  from  x  =  £.   The  shape  of  the  surface  (its 

relative  concavity  or  flatness)  is  determined  by  the  matrix 
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A  in  the  quadratic  term  of  equation  (F.l).   Therefore,  to 
study  the  shape  of  this  surface,  consider  the  simpler  pro- 
blem with  b  and  c  set  to  zero. 


y  =  xT  A  x  (F.3) 


One  way  to  examine  the  relative  flatness  or  concavity  is  to 
look  at  the  locus  of  points  on  the  surface  for  constant 

values  of  y ;  in  particular  when  y=l.   Recognize  that  A  can 

T 
be  rewritten  as  Q  A  Q   where  _A  is  a  diagonal  matrix  of 

eigenvalues  and  Q  is  a  matrix  whose  columns  are  the  unit 

length  eigenvectors  of  A.   Now  (F.3)  becomes 


xTQ  A  QT  x  =  1  (F.4) 


T 
and  introducing  a  new  set  of  variables  w=Q  x  (which  are 

simply  a  rotation  of  the  variables  in  x) ,  equation  (F.4) 

reduces  to 


A..W?  +  .  .  .  +  X  w2  =  1  (F.5) 

11  n  n 


This  equation  describes  an  ellipsoid  in  n  dimensions  whose 
axes  half  lengths  are  given  by  1//XT  for  1  <   i  <  n.   This 
follows  from  letting  all  but  one  of  the  w's  equal  to  zero 
and  solving  for  the  nonzero  variable  so  that  one  point  on  the 
surface  is  for  example  w,  =  1/ /XT  with  w   =  ..,  =  w   =0. 
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This  point  is  just  a  multiple  of  the  first  eigenvector  of 
A  so  that  in  general,  the  axes  of  the  ellipsoid  point  in 
the  direction  of  the  eigenvectors  of  A  with  half  lengths 
given  by  the  recriprccal  of  the  square  root  of  the  eigen- 
values . 
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APPENDIX  G 
Multichannel  ARMA  Modeling 

Consider  a  system  with  Q   output  signals  [y, (k)...yn  rv\i 

and  Q.  input  signals  [u, (k)  ...  un  (k)].   The  multichannel 

^i 
ARMA  analysis  model  forms  an  estimate  of  the  present  value 

of  each  output  as  a  weighted  combination  of  past  values  of 

all  output  signals  and  past  and  present  values  of  all  input 

signals  . 

^0    N 

y   (k)  =  /_,  2-i  b.  (i)  y.(k-i) 

"i    N 


E/j  a .  ( i 


._)  u.(k-l)  (G.l) 

J=T   1=0    ^n     3 


Define  data  vectors  for  all  the  input  and  output  channels  as 


y  (k)  =  [y  (k-1)  ...  y  (k-N)]T  (G.2a) 

— n        n  J  n 


u+(k)  =  [u  (k)  ...  u  (k-N)]T  (G.2b) 

— n        n         n 


and  embed  them  into  QnN  and  Q.(N+1)  vectors  given  by 

T 
Y  =  [y^k)7    ...    ^  (k)T]  (G.3a) 

T 
U+  =  [u*(k)T   .  .,    u!  (k)T]  (G.3b) 

^i 
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Define  NQQ  x  QQ  and  (N+DQ.  x  QQ  matrices   of  coefficients 
given  by 


3 


and 


-11 


% 


01 


fel. 


^QnQ 


oxo 


(G.Ua) 


A+  = 


where 


b.  . 


+ 
^11 


V 


Cb.-U) 


^1, 


-QiQ 


o 


. .  b. .(N)] 


(G.i+b) 


(G.^c) 


a. .  =  [a. .(0) 

-13      i] 


a. .(N)] 

i] 


(G.4d) 


With  these  definitions,  the  multichannel  ARMA  estimate  for 
the  vector  of  output  signals  becomes 


y(k)1  =  [Y1  ,  U   ] 


B 


A 


(G.5) 


Forming  a  prediction  error  vector  as 


en(k)  =  y(k)  -  y_(k) 


(G.6) 
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and  setting. the  model  coefficients  to  minimize  the  trace  of 
the  prediction  error  covariance  matrix  results  in  a  solu- 
tion given  by 


11      YU 


£  +     *  +  + 

U  Y    U  U 


A 


Rv 

-Yy 


s  + 

U  y 


(G.7) 


In  the  transform  domain,  the  prediction  error  model  is  re- 
presented by 


EQ(z)T  =  Y(z)T  [I-B(z)]-U(z)T  A(z) 


(G.8) 


where  E,  Y  and  U  are  the  transforms  of  the  error,  output 

and  input  vectors  and  the  coefficients  of  the  i,j-th  elements 

of  the  matrix  polynomials  B(z)  and  A(z)  are  the  elements  of 

+ 


the  vectors  b . .  and  a 

-ID      -13 

model  is  then  given  by 


The  multichannel  ARMA  synthesis 


YT(z)  =  U(z)T  A(z)Cl-B(z)]"1 


(G.9) 


with  the  matrix  polynomial  fraction  serving  as  the  generali- 
zation of  the  zero  pole  transfer  function. 
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APPENDIX  H 
Delay  Free  Loops 

To  develop  equation  (4.24)  guaranteeing  the  absence  of 
delay  free  loops  in  the  system  of  Figure  4.8,  consider  the 
equation  for  y„(k). 


yN(k)  =  F[xN(k)]  (H.l) 


Since  F C . ]  is  a  memoryless  nonlinear  function,  proving  that 
x„(k)  is  not  a  function  of  y^(k)  is  equivalent  to  proving 
that  y^Ck)  is  not  a  function  of  itself,  and  therefore  no 
delay  free  loops  exist.   From  equations  (4.17b)  and  (4.17d) 
with  u(k)=0  it  follows  that 


*N(z)  =  I2  I(z)  Ii  XN(z)  (H-2) 


and  the  coefficient  of  yM  at  time  k  on  the  right  hand  side 
is  given  by 


a  =  Lim  £2  T(z)  £,  (H.3) 

Z-voo 

A  nonzero  a.,  indicates  a  dependence  of  x„T.(k)  upon  y...(k) 
lj  r  Ni      r    yNj 

and  clearly  therefore  all  the  main  diagonal  elements  of  a 
must  be  zero  to  avoid  delay  free  loops.   These  elements  are 
the  terms  of  the  (Q-l)-st  principal  minors  of  a.   While  this 
is  a  necessary  condition  it  is  not  sufficient  to  avoid  delay 
free  loops  since  loops  may  exist  through  two  or  more  signals 
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The  condition  that  a.,  a..  =  0  for  all  i,i  such  that 

1  <  i,j  <   Q  and  ±£ j ,  ensures  that  no  delay  free  locp  exist 

through  2  signals  and  is  equivalent  to  requiring  all  terms 

of  the  (Q-2)-nd  principal  minors  of  a,   are  zero.   A  term  of 

a  determinant  [Ref.  63]  is  defined  as  the  product  of  elements 

of  the  matrix  taken  one  from  each  row  and  one  from  each 

column  and  the  determinent  of  the  matrix  is  the  sum  of  all 

possible  terms.   It  is  clear  therefore  that  every  term 

must  contain  a  cycle  such  as  a.. a.,  ...cu.  and  must  there- 

J  lj  jk      li 

fore  be  zero.   Since  the  determinant  consists  of  every 
possible  term,  requiring  that  all  terms  of  the  determinants 
of  the  (Q-i)-th  principal  minors  are  zero  ensures  that  no 
delay  free  loops  exist  through  any  combination  of  i  signals. 
Examining  all  terms  of  the  determinant  of  a  and  all  its 
principal  minors  ensures  that  all  possible  loops  through 
the  Q  signals  in  x„(k)  are  examined.   If  any  delay  free  loop 
exists,  then  at  least  one  of  the  terms  of  one  of  the  deter- 
minants will  be  nonzero. 
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APPENDIX  I 
Nonlinear  ARMA  Modeling  Examples 

The  determination  of  memory  requirements  for  the  non- 
linear ARMA  model  as  well  as  its  applicability  to  systems 
consisting  of  interconnections  of  linear  and  memoryless  non- 
linear subsystems  is  illustrated  here  using  two  examples. 
First  a  cascade  of  linear  and  nonlinear  subsystems  is  con- 
sidered.  Then  a  real  world  example  is  considered  and  a 
nonlinear  ARMA  model  is  proposed  for  the  tracking  behavior 
of  a  phase  locked  loop. 

A.   A  CASCADE  OF  LINEAR  AND  NONLINEAR  SUBSYSTEMS 

Consider  the  system  shown  in  Figure  1.1  where  the  signals 
u(k)  and  y.  „(k)  are  observed.   In  terms  of  the  topology  of 
Figure  4.8,  seven  signals  can  be  identified  (x  , ,  x  9,  y 

^L2'  XN1'  ^Nl  anc^  u^  however  for  convenience  three  of  the 
seven  equations  in  (4.26)  which  specify 


xL1(k)  =  u(k)  (I. la) 


xL2(k)  =  yN1(k)  (I. lb) 

xN1(k)  =  yL1<k)  (I.lc) 

will  not  be  explicitly  written.  In  this  case,  equation 
(4,26)  becomes 
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u(k)- 


A  (z) 

1-B  (z) 


A2(z) 
1-B?(z) 


*yL2(k) 


Figure  1,1,   A  Nonlinear  System 


yL2(k) 


u(k) 


yL1(k) 


yN1(K) 


b2(k)* 


a,(k)* 


0                  a    (k)* 

0                  0 

b1(k)*     0 
F1C«]        0 

yL2(k) 
u(k) 

yL1(k) 
^Nl(k) 

(1.2) 


where  the  finite  memory  representations  of  T-.Cz)  and  T~(z) 
have  been  used.   The  objective  now  is  to  eliminate  a„(k)* 
from  the  upper  right  partition.   Using  the  equations  of  rows 
three  and  four,  the  first  row  can  be  rewritten  as 


yL2(k)=b2(k)»  yL2(k)+a2(k)"F1[a1(k)"u(k)+b1(k)"yL1(k^ 


(1.3) 


Notationally  it  is  difficult  to  write  equation  (1.3)  in  the 
operator  matrix  form  of  equation  (1.2)  because  of  the  non- 
linear function  however,  the  following  representation  is 
adopted. 
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No  further  reductions  are  possible  to  make  the  upper  right 
partition  a  null  matrix  leading  to  the  conclusion  that  for 
a  finite  memory  nonlinear  ARMA  model  to  be  appropriate, 
either  b, (k)  must  be  zero  (the  first  linear  system  must  be 
nonrecursive)  or  yT-.(k)  must  also  be  observed.   Alternately, 
if  b. (k)  is  nonzero,  an  infinite  memory  representation  can 
be  used  for  the  first  linear  system  by  replacing  a, (k)*(.) 
with  h-.(k)5'f(.)  and  b,(k)*(.)  with  zero  in  equation  ( 1 . 4 ) 
indicating  that  an  infinite  memory  nonlinear  ARMA  model  is 
appropriate  when  only  u(k)  and  yT„(k)  are  observed. 

B.   A  NONLINEAR  ARMA  MODEL  FOR  A  PHASE  LOCKED  LOOP 

A  continuous  time  model  for  the  tracking  behavior  of  a 
phase  locked  loop  [Ref.  55]  is  shown  in  Figure  1.2  where: 

0, (t)  is  the  phase  of  the  incoming  signal 

8«(t)  is  the  estimate  of  the  phase  of  the  incoming  signal 

e(t)  is  the  phase  error  signal 

F(s)  is  the  transfer  function  of  the  loop  filter 

K   and  K  are  constants 


Figure  I . 2 


A  nonlinear  model  for  the  tracking 
behavior  of  a  phase-locked  loop. 
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The  model  is  nonlinear  because  of  the  sin  function  in  the 
loop.   Often  the  assumption  is  made  that  e(t)<<Tr/2  so 
that  sin  e(t)  2    e(t)  in  which  case  a  linearized  model  is 
obtained  as 


e2(s) 


K  K  FCs) 


9..  (s)  "  K,K  F(s)+s 
1        1  2 


(1.5) 


A  nonlinear  ARMA  model  for  the  system  can  be  obtained  by 
first  discretizing  the  model  of  Figure  1.2  as  shown  in 
Figure  1.3  where  F(z)  represents  the  discrete  loop  filter 


@1(k) 


-f-  s-^  e(k) 

*Z) 


Asin( . ) 


y  (k) 
n 


-i 


F(z) 


-f- 


-kIH 


-i 


92(k) 


Figure  1.3.   A  discrete  version  of  the  phase- 
locked  loop  model. 


-1 
z 
The  integration  has  been  approximated  as  — y  .   (It  is 

1-z"1 
necessary  to  use  this  Euler  forward  approximation  for 

integration  rather  than  one  such  as  the  trapezoid  rule  to 

avoid  a  delay  free  loop) .   Defining  a  single  linear  system 

as  the  cascade  of  F(z)  and  the  discrete  integration 

-1 


K,K 


12  .   -1 
1-z 


F(z)  = 


A(z) 
l-B(z) 


(1.6) 
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Equation  (4.26)  for  the  phase  locked  loop  becomes 


e2(k) 

91(k) 


yN(k) 

e(k) 


b1(k) 


0 
-1 


0   |  ax(k) 


sin( . ) 

0 


9„(k) 


91(k) 


yNCk) 

e(k) 


(1.7) 


where  it  is  assumed  that  0, (k)  and  9?(k)  are  observed.   Using 
the  relationships  specified  by  rows  3  and  4 ,  this  can  be 
written  as 


9„(k) 


9,  (k) 


yNCk) 

e(k) 


b1(k) 


0 
-1 


sin( . ) 
0 


e2(k) 

91(k) 


yN(k) 

e(k) 


a]_(k) 


sin[-( , ) 

0 


(,)]  |  0   0 
0      0   0 


0    '00 

I 

0    ,00 


92(k) 
01(k) 


yNCk) 

e(k) 


(1.8) 


from  which  it  is  clear  that  a  finite  memory  nonlinear  ARMA 
model  is  appropriate.   The  model  can  be  obtained  from  the 
first  row  in  equation  1.8  by  substituting  a  series  expansion 


248 


for  the  sin  function  and  truncating  it  at  the  degree  desired 
resulting  in 

M 
60(k)=b,(k)*e0(k)   +  an(k)*  2-r    rferr^  ,  (90(k)-9.,  (k)  )2m+1 

2  12  1  ™"i    (  2  m+ 1 )  !       2  1 

m=0 

(1.9) 

where  2M+1  is  the  degree  of  the  series  approximation  to 

sin(.).   (Note  that  lim  A  (z)  =  lim  B-.(z)  =  0  so  that  the 

z->°°  z-*-°° 

right  hand  side  of  equation  (1.9)  involves  only  past  values 

of  the  output  9«(k).)   An  infinite  memory  Volterra  series 

model  for  this  system  has  been  considered  by  Van  Trees 

[Ref.  55], 


249 


APPENDIX  J 
Model  Simulation  Program  Listings 

This  appendix  provides  a  listing  of  the  fortran  model 
simulation  programs  used  in  the  experimental  study  of  the 
lattice  characteristics.   Included  are  the  main  programs 
for  the  batch  processing  ARMA  lattice,  the  adaptive  ARMA 
lattice  and  the  brute  force  model  solution.   Each  main 
program  is  followed  by  a  collection  of  subroutines  used  only 
by  that  specific  main  program.   Then  a  collection  of  common 
subroutines  called  from  two  or  more  locations  in  the  batch 
lattice,  adaptive  lattice  or  brute  force  method  is  listed. 
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